INTRODUCTION


In  clinical  follow-up  studies,  subjects  are  monitored  at  regular  time  intervals  for  a 
medical  condition.  It  is  often  the  case  that  an  event  under  observation  can  take  place  in 
between  two  successive  visits,  and  it  may  not  be  possible  for  the  subject  to  know  the  time 
to  such  event  exactly.  For  example,  consider  the  situation  in  which  a  group  of  women 
at  high  risk  for  breast  cancer  is  asked  to  take  a  chemopreventive  substance  for  a  fixed 
time  period.  At  the  end  of  the  period,  each  participating  woman  is  required  to  submit 
a  blood  or  urine  sample  at  regular  intervals  in  order  to  monitor  the  level  of  a  validated 
intermediate  biomarker.  Let  X  denote  the  time  from  cessation  of  use  of  the  agent  to  the 
loss  of  its  protective  effect  qualified  as  a  return  to  baseline  value  of  the  biomarker.  If  a 
woman  submits  a  sample  for  assay  on  a  daily  basis,  the  value  of  X  can  be  observed  exactly, 
unless  the  protective  effect  is  still  present  by  the  time  the  study  is  terminated  so  that  X 
is  right-censored  in  the  usual  sense  of  survival  analysis.  In  practice,  however,  the  follow-up 
interval  can  be  a  week  or  longer;  therefore  the  exact  value  of  X  is  generally  unknown  but 
is  known  to  lie  between  the  time  points  L  and  R,  where  L  is  the  number  of  days  from 
cessation  of  agent  intake  to  the  last  time  the  sample  was  assayed  and  the  protective  effect 
was  still  present,  and  R  is  the  number  of  days  from  cessation  of  agent  intake  to  the  most 
recent  time  the  sample  was  assayed.  If  the  protective  effect  is  still  present,  then  R  takes 
the value infinity. In any case, when the value of X is only known to lie in the interval (L, R), we
say that X is censored in the interval (L, R). Therefore the observed data consist of either
censoring intervals (L, R) or exact observations X = L = R.

We consider nonparametric estimation of the distribution function $F(t)$ of a real-valued
random variable $X$ (or its survival function $S(t) = 1 - F(t)$, where $F(t) = P\{X \le t\}$), when
the sample data are incomplete due to restricted observation brought about by interval
censoring.

At present, there are only two estimation procedures for $S$ with interval-censored data that
yield generalized maximum likelihood estimates (GMLE) in the sense of Kiefer and Wolfowitz
[1]. The first is due to Peto [2] and makes use of the Newton-Raphson algorithm. The
second is due to Turnbull [3] and makes use of a self-consistent algorithm. In both cases,
there is no closed-form expression for the estimator, and the computational burden of the
algorithm limits the sample size that can be handled.

In the first year of our research, we have focused our attention on interval-censored
data that satisfy a condition which we call Condition DI: data $\{L_1, R_1\}, \ldots, \{L_n, R_n\}$ are
said to satisfy Condition DI if, given any two censoring intervals $(L_i, R_i)$ and $(L_j, R_j)$, either
they are disjoint or one is a subset of the other. In a clinical study in which every subject
has the same follow-up schedule, say at time points $a_1, a_2, \ldots, a_k$, we have $\{L, R\} = \{0, a_1\}$, or
$\{a_i, a_{i+1}\}$, or $\{a_i, \infty\}$, and hence such interval-censored data will satisfy Condition DI.

Under the DI interval-censorship model, we extend Efron's [4] redistribution-to-the-right
idea for right-censored data and propose a redistribution-to-the-inside (RTI) method to
yield a nonparametric estimator of $S(t)$, which we call the redistribution-to-the-inside estimator
(RTIE), denoted by $\hat{S}_I$. Such an estimate has a closed-form expression and can be quickly
calculated for interval-censored data of any dimension.

In our first year, we have accomplished two important tasks for $\hat{S}_I$:

1. We have implemented a computer program coded in the C language to carry out the
RTI procedure, including a Kaplan-Meier [5] type plotting program written in the S+
language for displaying $\hat{S}_I(t)$.

2. We have proved the important result that $\hat{S}_I$ is strongly consistent.

Two completed manuscripts, one pertaining to task (1) and general properties of $\hat{S}_I$ for
DI data, and the other pertaining to task (2), are being prepared for submission to
peer-reviewed statistical journals. They are included in the Appendix as part of our first year
report.

BODY


RTI METHOD. We here present the idea of our RTI method and the computer
program to calculate the RTIE $\hat{S}_I$. Denote $\delta_i = 1[L_i = R_i]$, where $1[A]$ is the indicator
function of a set $A$. For convenience, we first assume that there are no ties in the $L_i$'s and
in the $R_i$'s. Let $L_{(i)}$ be the $i$-th smallest order statistic among the $L$'s and let $\delta_{(i)}$ be the
$\delta_j$ associated with $L_{(i)}$, and likewise for $X_{(i)}$ and $R_{(i)}$, $i = 1, \ldots, n$. An observation is said to be a
complete observation (CO) in an interval $(l, r)$ if either it is an exact observation which is
included in $(l, r)$, or it is a censoring interval which is contained in $(l, r)$.

Although the evaluation of $\hat{S}_I$ does not require any intensive and expensive numerical
computing, it does become tedious when the sample size is large. We have implemented
our first version of a computer algorithm to calculate $\hat{S}_I$, written in C. The main
portion of the program is given in the following.

We also include a Kaplan-Meier type plot of $\hat{S}_I$ for relapse-free survival data from n =
374 women with primary stage I or II breast cancer treated by surgery. The corresponding
usual Kaplan-Meier estimate, treating the interval-censored data as right-censored data, is
also plotted for comparison.

MAIN PROGRAM FOR RTI PROCEDURE

/* This is the routine to compute estimates of a distribution */
/* interval-censored data are input from data.in file */
/* Three estimates are computed here depending on the selection */
/* in the para.in files */
/* There are two more files: rtie.h and util.c */
/* output file name is given in para.in */

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <malloc.h>
#include "rtie.h"
#include <time.h>

int END;
float INV_POW;
float para_gen[9][2];
float **xy;

int main() {
    int type[4], simu_switch, no_gp, no_data;   /* four type entries are read below */
    int i, j, n, power();
    int endl, Ipara[4], *index;
    int nout, k, NR, DF;
    float R[100];
    float **a, **b, para_in[4][2];
    float c, xlim, eta;
    float x[20], nF[20], pF[20], sF[20];
    void l_qsort();
    void r_qsort();
    void swap();
    void r_swap();
    void s_trans();
    void data_form();
    void weight();
    void print_info(), read_data(), print_curve(), print_percentile();
    void print_cdf();
    float **dmatrix();
    char ofname[80], multi_fp[80], junk[80];
    time_t tvec;
    FILE *inf1, *inf2;
    float *F;

    for (i = 0; i < 9; i++)
        for (j = 0; j < 2; j++)
            para_gen[i][j] = 1.;

    /* open 1 input data files */
    inf1 = fopen("para.in", "r");

    /* Read the parameter input file */
    /* skip 12 header entries; ofname is only scratch space here */
    for (i = 0; i < 12; i++)
        fscanf(inf1, "%s%*[^\n]", ofname);
    fscanf(inf1, "%d%*[^\n]", &simu_switch);
    fscanf(inf1, "%d%*[^\n]", &END);
    for (i = 0; i < 4; i++) {
        fscanf(inf1, "%d", &type[i]);
        /* para_in is float, so the conversion is %f rather than %lf */
        fscanf(inf1, "%f %f%*[^\n]", &para_in[i][0], &para_in[i][1]);
    }
    fscanf(inf1, "%d%*[^\n]", &Ipara[0]);
    fscanf(inf1, "%d%*[^\n]", &Ipara[1]);
    fscanf(inf1, "%d%*[^\n]", &Ipara[2]);
    fscanf(inf1, "%d%*[^\n]", &Ipara[3]);
    fscanf(inf1, "%s%*[^\n]", ofname);
    fclose(inf1);

    a = (float **) dmatrix(0, 1, 0, END);
    F = (float *) malloc((unsigned) (END * sizeof(float)));
    index = (int *) malloc((unsigned) (END * sizeof(int)));

    /* output and NEGATIVE are not declared in this listing;
       presumably they come from rtie.h */
    sprintf(multi_fp, "%s", ofname);
    output = fopen(multi_fp, "w");

    print_info(simu_switch, END, type, para_in, Ipara);
    for (i = 0; i < END; i++) {
        a[0][i] = 0.0;
        a[1][i] = 0.0;
        F[i] = 0.;
        index[i] = 0;
    }

    time(&tvec);
    fprintf(output, "TIME:%s\n", ctime(&tvec));

    /* open input data file */
    inf2 = fopen("data.in", "r");
    fscanf(inf2, "%s%*[^\n]", junk);
    if (simu_switch == 0) {
        for (i = 0; i < END; i++)
            fscanf(inf2, "%f %f%*[^\n]", &a[0][i], &a[1][i]);
    }
    if (simu_switch > 0) {
        i = 0;
        while (i < END) {
            read_data(a, &i, inf2, Ipara);
        }
    }
    fclose(inf2);

    /* an exact observation (L == R) is flagged by setting its right endpoint to 0 */
    for (i = 0; i < END; i++)
        if (a[0][i] == a[1][i])
            a[1][i] = 0.0;

    if (simu_switch >= 2) {
        data_form(a, simu_switch);  /* use other two approaches */
    }

    /* sort by left endpoint, then sort the leading block by right endpoint */
    l_qsort(a, END, 0, END-1);
    i = 0;
    while (a[0][i] == NEGATIVE && a[1][i] > 0.0) i++;
    endl = i;

    r_qsort(a, endl, 0, endl-1);
    s_trans(a, endl);

    /* compute the weights and print the estimated distribution */
    endl = 0;
    weight(a, index, &endl, F);
    print_cdf(a, F, index, &endl, Ipara);

    j = 0;
    for (i = 0; i < 19; i++) {
        while ((x[i] >= a[0][index[j]]) && (j <= endl))
            j++;
        nF[i] += F[j-1];
        sF[i] += F[j-1] * F[j-1];
    }

    time(&tvec);
    fprintf(output, "TIME:%s\n", ctime(&tvec));
CONSISTENCY. Under the DI model, we have proved the important result regarding
the consistency of $\hat{S}_I$ as a nonparametric estimator of $S$ for interval-censored data. Let us
define

$$\mathcal{O} = \{t;\ t \notin [\tau_l, \tau_r]\}, \qquad \tau_l = \inf\{t;\ P\{L < t < R\} = 1 \text{ or } t = +\infty\},$$

$$\tau_r = \min\{\sup\{t;\ P\{L < t < R\} = 1\},\ +\infty\}.$$

We prove that for any $F$ and censoring distribution function $G$,

$$\lim_{n \to \infty} \sup_{t \in \mathcal{O}} |\hat{S}_I(t) - S(t)| = 0 \quad \text{a.s.}$$

If $\tau_l = +\infty$, then $\mathcal{O} = [0, \infty)$. Otherwise, $\mathcal{O}$ is either $[0, \tau_l)$ (right-censorship models) or
$(\tau_r, \infty)$ (left-censorship models) or $[0, \tau_l) \cup (\tau_r, \infty)$, where $0 < \tau_l < \tau_r < \infty$. Since there
are no observations within the interval $(\tau_l, \tau_r)$ (w.p.1), $S(t)$ is not estimable for
$t \in (\tau_l, \tau_r)$.


CONCLUSIONS 

As we point out in the INTRODUCTION, interval-censored data are commonly encountered
in breast cancer follow-up studies, and there has been a lack of a computationally feasible
statistical procedure for estimating the survival function $S$ even for studies with moderate
sample sizes. In our first year of research, we have completed a computer program that can
quickly evaluate the nonparametric estimator $\hat{S}_I$ which we propose, and produce a
Kaplan-Meier type plot as part of the program. In the BODY section, our program quickly produces
the $\hat{S}_I$ plot for overall relapse-free survival for interval-censored data from 374 women with
stage I or II breast cancer after treatment by surgery. As can be seen from the plot, there is
an appreciable difference between the usual Kaplan-Meier estimator and our $\hat{S}_I$ estimator.
The strong consistency that we have established for $\hat{S}_I$ under the DI model is a significant
statistical result. We now can reassure users of $\hat{S}_I$ that the estimated value of $S$ will be
close to the true value when the sample size is moderate.

Our immediate research goals for the second year are to extend the results established
here to the case of non-DI data. Specifically, we will extend the RTI method to obtain the
counterpart of $\hat{S}_I$ for non-DI data. Then we will investigate conditions under which the
corresponding estimator is the GMLE and is consistent. We expect these non-DI extensions
to be statistically fairly challenging. However, they are very important results,
because the majority of interval-censored data in real applications are likely to be non-DI
in nature.
Summary: We consider nonparametric estimation of a survival function with
interval-censored data which satisfy the condition that for any pair of censoring intervals, either they
are disjoint or one is a subset of the other. Extending Efron's (1967) redistribution-to-the-right
method for deriving the Product-limit estimator (PLE), we propose a
redistribution-to-the-inside (RTI) method which yields an estimate of the survival function, given by a simple,
explicit expression. The expression reduces to the PLE under the right censorship model
or the left censorship model. The new estimator is shown to be the generalized maximum
likelihood estimator of the survival function in the sense of Kiefer and Wolfowitz (1956),
and hence is self-consistent in the sense of Turnbull (1976). Extension of the RTI method
to the general interval censorship model is also discussed.

1.  Introduction.  In  clinical  follow-up  studies,  subjects  are  monitored  at  regular  time 
intervals  for  a  medical  condition.  It  is  often  the  case  that  an  event  under  observation  can 
take  place  in  between  two  successive  visits,  and  it  may  not  be  possible  for  the  subject  to 
know  the  time  to  such  event  exactly.  For  example,  consider  the  situation  in  which  a  group 
of  women  at  high  risk  for  breast  cancer  is  asked  to  take  a  chemopreventive  substance  for  a 
fixed  time  period.  At  the  end  of  the  period,  each  participating  woman  is  required  to  submit 
a  blood  or  urine  sample  at  regular  intervals  in  order  to  monitor  the  level  of  a  validated 
intermediate  biomarker.  Let  X  denote  the  time  from  cessation  of  use  of  the  agent  to  the 
loss  of  its  protective  effect  qualified  as  a  return  to  baseline  value  of  the  biomarker.  If  a 
woman  submits  a  sample  for  assay  on  a  daily  basis,  the  value  of  X  can  be  observed  exactly, 
unless  the  protective  effect  is  still  present  by  the  time  the  study  is  terminated  so  that  X 
is  right-censored  in  the  usual  sense  of  survival  analysis.  In  practice,  however,  the  follow-up 
interval  can  be  a  week  or  longer;  therefore  the  exact  value  of  X  is  generally  unknown  but 
is known to lie between the time points L and R, where L is the number of days from
cessation of agent intake to the last time the sample was assayed and the protective effect
was still present, and R is the number of days from cessation of agent intake to the most
recent time the sample was assayed. If the protective effect is still present, then R takes
the value infinity. In any case, when the value of X is only known to lie in the interval (L, R), we
say that X is censored in the interval (L, R). Therefore the observed data consist of either
censoring intervals (L, R) or exact observations X = L = R.

* Partially supported by NSF Grant DMS-9402561 and DAMD17-94-J-4332.
** Partially supported by DAMD17-94-J-4332.

We consider nonparametric estimation of the distribution function $F(t)$ of a real-valued
random variable $X$ (or its survival function $S(t) = 1 - F(t)$, where $F(t) = P\{X \le t\}$), when
the sample data are incomplete due to restricted observation brought about by interval
censoring.

Interval-censored observations consist of vectors $\{L_1, R_1\}, \ldots, \{L_n, R_n\}$, where $L_i \le R_i$,
$i = 1, \ldots, n$. We assume that these observations are i.i.d. from the population $\{L, R\}$.
An observation is said to be exact if $L = R = X$, and is called a censoring interval if
$R - L > 0$. A censoring interval is said to be empty if it does not contain exact observations
or other censoring intervals $(L_i, R_i)$. Interval-censored data are said to be from a DI
interval-censorship model if the observations $\{\{L_k, R_k\};\ k = 1, \ldots, n\}$ satisfy

Condition DI (Disjoint or Included): Given any two censoring intervals, $(L_i, R_i)$ and
$(L_j, R_j)$, either they are disjoint or one is a subset of the other.

To  illustrate,  consider  the  following  two  data  sets. 

$$(_1 \ (_2 \ )_1 \ )_2, \qquad (1.1)$$

where $(_i$ stands for $L_i$ and $)_i$ stands for $R_i$, $i = 1, 2$, that is, $L_1 < L_2 < R_1 < R_2$;

$$(_1 \ )_1 \ (_2 \ (_3 \ (_4 \ )_4 \ )_3 \ )_2. \qquad (1.2)$$

Data set (1.1) does not satisfy Condition DI, whereas Data set (1.2) does. Note that the
familiar right-censored data satisfy Condition DI, with $R = +\infty$ if $L < R$, since $(x, +\infty) \supset
(y, +\infty)$ if $x < y$. Similarly, left-censored data also satisfy Condition DI, with $L = 0$ if
$L < R$ and with half-closed, half-open censoring intervals $[0, R)$.
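
Condition DI can be checked mechanically by comparing every pair of censoring intervals. The following is a minimal sketch in C and is not part of the authors' program: the type `interval_t` and the functions `pair_is_di` and `satisfies_di` are hypothetical names introduced here, open intervals with numeric endpoints are assumed, and `HUGE_VAL` is assumed to stand in for a right endpoint of $+\infty$.

```c
#include <assert.h>
#include <math.h>     /* HUGE_VAL, usable by callers as "R = +infinity" */
#include <stddef.h>

/* A censoring interval (l, r) with l < r. Illustrative type only. */
typedef struct { double l, r; } interval_t;

/* Returns 1 if the open intervals a and b are disjoint or nested, else 0. */
static int pair_is_di(interval_t a, interval_t b)
{
    int disjoint = (a.r <= b.l) || (b.r <= a.l);
    int a_in_b   = (b.l <= a.l) && (a.r <= b.r);
    int b_in_a   = (a.l <= b.l) && (b.r <= a.r);
    return disjoint || a_in_b || b_in_a;
}

/* Returns 1 if every pair among the n censoring intervals is disjoint
   or nested (Condition DI), else 0. Exact observations (l == r) are
   not censoring intervals and should be left out of the array. */
int satisfies_di(const interval_t *iv, size_t n)
{
    size_t i, j;
    assert(iv != NULL || n == 0);
    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (!pair_is_di(iv[i], iv[j]))
                return 0;
    return 1;
}
```

For instance, two intervals with $L_1 < L_2 < R_1 < R_2$, as in Data set (1.1), make `satisfies_di` return 0, whereas right-censored intervals $(x, +\infty)$ and $(y, +\infty)$ are nested and pass the check. The pairwise scan costs on the order of $n^2$ comparisons; it is meant to mirror the definition, not to be efficient.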

In a clinical study in which every subject has the same follow-up schedule, say at
time points $a_1, a_2, \ldots, a_k$, we have $\{L, R\} = \{0, a_1\}$, or $\{a_i, a_{i+1}\}$, or $\{a_i, \infty\}$, and hence such
interval-censored data will satisfy Condition DI.

There is only one set of nested censoring intervals in Data set (1.2). Since
right-censored observations form a unique set of nested censoring intervals, it follows that, treating
empty censoring intervals as exact observations, Data set (1.2) is topologically equivalent
to a set of right-censored data: $X_1 < X_2^+ < X_3^+ < X_4$, where $X^+$ stands for a right-censored
observation. However, not all DI data are topologically equivalent to right-censored data.
For example, in the following DI data,


(1  )l  ^  (3  )3  (5  (3(7  (s  )8  )?  )6  (9  (10  )l0  )9  )5  (11  )ll  ^  (12  )l2,  (1-3) 
there are two sets of nested censoring intervals;

^  (5  (6(7  (s  )8  )7  )6  jg  j  and  ^  (5  (9  (10  )io  )9  )5  )4  j  • 

They  are  not  disjoint  (there  are  common  censoring  intervals  in  these  two  sets),  but  they 
cannot  form  a  unique  set  of  nested  censoring  intervals  after  excluding  empty  censoring 
intervals  from  the  two  sets.  Thus,  Data  set  (1.3)  is  not  topologically  equivalent  to  right- 
censored  data. 

Peto (1973) and Turnbull (1976) consider the problem of obtaining the generalized
maximum likelihood estimate (GMLE) of the underlying survival distribution based on
interval-censored data (in the sense of Kiefer and Wolfowitz (1956)) using a Newton-Raphson
type algorithm and a self-consistent algorithm, respectively. Bacchetti (1990) addresses
some extensions of Turnbull's approach. Chang and Yang (1987) and Groeneboom and
Wellner (1992) deal with the problem of estimating the underlying survival distribution
with doubly-censored data and study the corresponding consistency properties. None of these
authors, however, derives a closed-form expression for the estimator.

It  is  worth  noting  that,  while  the  GMLE  is  unique  (Peto  (1973))  and  is  self-consistent 
(see  Tsai  and  Crowley  (1985)),  an  estimate  derived  from  the  self-consistent  algorithm  may 
not  be  unique  and  thus  may  not  be  the  GMLE.  Gu  and  Zhang  ((1993)  page  612)  give  a 
counter-example  as  follows: 

Example 1.1. There are two different self-consistent estimates for a doubly-censored
data set with four observations: $V_i = i$, $i = 1, \ldots, 4$, where $V_1$ is exact, $V_2$ is
right-censored, and $V_3$ and $V_4$ are left-censored. One self-consistent estimate puts mass 2/3
at 1 and mass 1/3 at 4, and the other self-consistent estimate puts mass 1/2 at both 1
and 3. The second estimate is the GMLE, but the first is not.

Because multiple solutions are possible in a self-consistent algorithm, Gu and Zhang (1993)
have to add an additional assumption in their theorem for establishing asymptotic normality
(Theorem 2), so that the solution $\hat{S}(t)$ from a self-consistent algorithm is indeed the GMLE.
Thus it is desirable to find an explicit expression of the GMLE of $S$. Furthermore, an exact
expression of the GMLE of $S$ will facilitate the establishment of its asymptotic normality and
the derivation of its asymptotic variance.

In this paper, we will mainly focus on finding a method to derive an explicit expression
for the GMLE under the DI model. Furthermore, we will study the possible extension of
the method to non-DI interval-censored data.

Kaplan and Meier (1958) derive the Product-limit estimator (PLE) for right-censored
data. The PLE has a simple expression, in contrast to estimates that are available only as
numerical solutions. Efron (1967) shows that the PLE can be obtained through a
redistribution-to-the-right (RTR) technique. Efron's idea can be extended to left-censored data, which
again results in a PLE with a simple expression.

In this paper, we extend Efron's idea and propose a redistribution-to-the-inside (RTI)
method to yield an estimate of $S(t)$ with data from a DI model. The new estimate, called
the redistribution-to-the-inside estimate (RTIE), has an explicit expression (see (4.3)). We
show in this paper that under the DI model the RTIE is indeed the GMLE. Thus, in this
case, it is the closed-form solution to the limit of the Newton-Raphson algorithm studied by
Peto (1973) and a solution to the limits of the self-consistent algorithm proposed by Turnbull
(1976). As a consequence, it is self-consistent according to the definition given in Turnbull
(1976). In particular, it reduces to the PLE under the right censorship model and under the
left censorship model. Thus, the RTIE unifies the expressions of the PLE with right-censored
data and left-censored data.

The  motivation  for  studying  the  DI  model  is  to  find  the  explicit  expression  of  the  GMLE 
for  general  interval-censored  data,  in  particular,  for  a  non-DI  data  set.  The  RTI  method 
for  the  DI  model  may  provide  some  insight  on  attacking  this  problem.  In  this  paper,  we 
modify  the  RTI  method  for  non-DI  data.  Such  an  estimator  also  has  an  explicit  expression. 
We  further  show  that  for  a  special  class  of  non-DI  data  the  estimate  derived  from  such  a 
modified RTI method is the GMLE. It is worth mentioning that the data set in Example 1.1
is a non-DI data set and that applying our modified RTI method to it also yields the GMLE.

The  RTI  method  can  be  implemented  as  an  n-step  algorithm,  where  n  is  the  number 
of  observations.  However,  it  is  not  a  special  case  of  the  self-consistent  algorithm.  The  RTI 
method  uniquely  defines  an  estimate;  the  self-consistent  algorithm  may  result  in  different 
estimates  depending  on  the  starting  points.  The  RTI  method  takes  no  more  than  n  steps; 
the  self-consistent  algorithm  is  an  iterative  algorithm  which  stops  whenever  the  error  is 
within  a  tolerance. 

In  section  2,  we  propose  the  RTI  method.  In  section  3,  we  show  that  the  new  estimator 
is  a  GMLE  under  the  DI  Model.  In  section  4,  we  give  a  simplified  explicit  expression  of  the 
new  estimator  under  the  DI  Model. 

2. RTI Method. In this section, we propose a method which extends Efron's
(1967) RTR technique for obtaining the PLE under the right censorship model. We assume
that the observations satisfy Condition DI. Denote $\delta_i = 1[L_i = R_i]$, where $1[A]$ is the
indicator function of a set $A$. For convenience, we first assume that there are no ties in the
$L_i$'s and in the $R_i$'s. Let $L_{(i)}$ be the $i$-th smallest order statistic among the $L$'s and let $\delta_{(i)}$
be the $\delta_j$ associated with $L_{(i)}$, and likewise for $X_{(i)}$ and $R_{(i)}$, $i = 1, \ldots, n$. An observation is said to be
a complete observation (CO) in an interval $(l, r)$ if either it is an exact observation which
is included in $(l, r)$, or it is a censoring interval which is contained in $(l, r)$.

Before we give an estimator of $S(t)$, it is instructive to look at the PLE $\hat{S}_{PL}(t)$ under
the right censorship model and the left censorship model.

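Under the right censorship model, the observations are $(L_1, \delta_1), \ldots, (L_n, \delta_n)$ and
$R_i = +\infty$ if $L_i < R_i$. The PLE is

$$\hat{S}_{PL}(t) = \prod_{L_{(i)} \le t} \left( \frac{n - i}{n - i + 1} \right)^{\delta_{(i)}}. \qquad (2.1)$$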


Efron (1967) introduced the RTR method to obtain the PLE: First put mass $1/n$ on each
observation $L_k$. Consider the smallest censoring time $L_{(i)}$. Since a death did not occur
at $L_{(i)}$, but somewhere to the right of it, it is reasonable to redistribute $1/n$, the mass at
$L_{(i)}$, equally among all observations to the right of $L_{(i)}$ (this can be viewed as redistribution
to the inside of $(L_{(i)}, R_{(i)})$). Now consider the next censored time, say $L_{(j)}$ ($j > i$); redistribute
the mass now at $L_{(j)}$ equally among all observations to the right of $L_{(j)}$ (again, this can be
viewed as redistribution to the inside of $(L_{(j)}, R_{(j)})$).
Treating the other censored times similarly results in the PLE as in (2.1).
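
The redistribution-to-the-right steps just described can be sketched in C as follows. This is an illustration only, not the authors' code: `rtr_masses` and `ple_after` are hypothetical names, the observations are assumed to be sorted by time with no ties, `delta[i] = 1` is assumed to mark an exact (uncensored) time, and `ple_after` uses the usual no-ties product-limit form, which is an assumption about the exact shape of (2.1).

```c
#include <assert.h>
#include <stddef.h>

/* Redistribution-to-the-right for n time-ordered observations (no ties).
   delta[i] = 1 for an exact time, 0 for a right-censored time.
   On return, mass[i] is the probability mass left at observation i.
   If the largest observation is censored, its mass stays where it is. */
void rtr_masses(const int *delta, double *mass, size_t n)
{
    size_t i, j;
    assert(n > 0);
    for (i = 0; i < n; i++)
        mass[i] = 1.0 / (double) n;
    for (i = 0; i + 1 < n; i++) {
        if (delta[i] == 1)
            continue;                       /* exact: mass stays put */
        /* censored: share the current mass equally among the
           n - i - 1 observations to the right */
        for (j = i + 1; j < n; j++)
            mass[j] += mass[i] / (double) (n - i - 1);
        mass[i] = 0.0;
    }
}

/* Product-limit value just after the k-th ordered observation,
   1 <= k <= n, using the no-ties product form. */
double ple_after(const int *delta, size_t n, size_t k)
{
    double s = 1.0;
    size_t i;
    for (i = 1; i <= k && i <= n; i++)
        if (delta[i - 1] == 1)
            s *= (double) (n - i) / (double) (n - i + 1);
    return s;
}
```

As a check on a small case, take five ordered times with the second and fourth censored. The redistribution leaves masses 1/5, 0, 4/15, 0 and 8/15, so the survival estimate after the third time is 1 - 1/5 - 4/15 = 8/15, which agrees with the product (4/5)(2/3).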

On the other hand, the product limit estimator of the distribution function $F(t)$ for
left-censored data can also be obtained by the redistribution-to-the-left (RTL) method.

It can be verified that the data from the left censorship model or right censorship model
satisfy Condition DI. Then from the interval-censored data point of view, both redistribution
methods can be unified as the redistribution-to-the-inside method. Using this method, we
can obtain an estimator $\hat{S}_c(t)$ of $S(t)$ under a more general interval censorship model. We
start with the following example to help illustrate the idea of the RTI method.

Example 2.1. Suppose that we have the following 6 observations:

$$(_1 \ (_2 \ )_2 \ (_3 \ X_4 \ (_5 \ )_5 \ )_3 \ )_1 \ (_6 \ )_6,$$

i.e., $L_1 < L_2 < R_2 < L_3 < L_4 = R_4 < L_5 < R_5 < R_3 < R_1 < L_6 < R_6$ ($L_1 > 0$ and
$R_6 < +\infty$). The data satisfy Condition DI. Let $p_1, \ldots, p_6$ be the weights on the observations
$\{L_1, R_1\}, \ldots, \{L_6, R_6\}$, respectively. We will derive the $p_i$'s in 7 steps:

0. Assign each of the 6 observations weight 1/6, i.e., $p_i^{(0)} = 1/6$;

1. Since $\{L_1, R_1\}$ is a nonempty censoring interval, i.e., the event occurred somewhere
inside $(L_1, R_1)$, it is reasonable to redistribute its weight $p_1^{(0)} = 1/6$ to its inside (unless
it is an empty interval), that is, to its 4 CO's $\{L_2, R_2\}$, $\{L_3, R_3\}$, $\{L_4, R_4\}$ and $\{L_5, R_5\}$
(thus each has $\frac{1}{6} \cdot \frac{1}{4}$ additional weight). Then $p_1^{(1)} = 0$, $p_2^{(1)} = \cdots = p_5^{(1)} = \frac{1}{6}(1 + \frac{1}{4}) = \frac{5}{24}$
and $p_6^{(1)} = 1/6$;

2. Since $\{L_2, R_2\}$ is an empty censoring interval, there is no CO inside $(L_2, R_2)$. The $p_i^{(2)}$'s
remain the same as in the last step;

3. Since $\{L_3, R_3\}$ is a nonempty censoring interval, redistribute its weight $p_3^{(2)} = \frac{1}{6}(1 + \frac{1}{4})$
to its 2 CO's $\{L_4, R_4\}$ and $\{L_5, R_5\}$. Thus $p_1^{(3)}$, $p_2^{(3)}$ and $p_6^{(3)}$ remain the same as in the
last step and $p_3^{(3)} = 0$, $p_4^{(3)} = p_5^{(3)} = \frac{1}{6}(1 + \frac{1}{4})(1 + \frac{1}{2}) = \frac{5}{16}$;

k. Since $L_4$, $L_5$ and $L_6$ each belong to either an exact observation or an empty censoring interval, no
change is made to the $p_i^{(k)}$'s, $k = 4, 5, 6$.

The values $p_i^{(6)}$, $i = 1, \ldots, 6$, i.e., $(0, \frac{5}{24}, 0, \frac{5}{16}, \frac{5}{16}, \frac{1}{6})$, are the solution to $(p_1, \ldots, p_6)$; and

$$\hat{S}_c(t) = \begin{cases} 1 & \text{if } t \in [0, R_2) \\ 19/24 & \text{if } t \in [R_2, X_4) \\ 23/48 & \text{if } t \in [X_4, R_5) \\ 1/6 & \text{if } t \in [R_5, R_6) \\ 0 & \text{if } t \ge R_6 \end{cases} \qquad (2.2)$$

is the estimate resulting from the RTI method.

From the example, we can see that the method always redistributes the weight
on a nonempty interval to the inside of the same interval, but not to the outside of the
interval. For example, the observation $\{L_6, R_6\}$ is to the right of $L_1$, the same as $\{L_2, R_2\}$.
However, it is not inside $(L_1, R_1)$, but $\{L_2, R_2\}$ is. The weight on the nonempty interval
$\{L_1, R_1\}$ is not redistributed to all the observations to the right of $L_1$. Thus, it is not a
redistribution-to-the-right method or an RTL method.

The formal statement of the method is as follows: We first determine the weight $p_i$
assigned to the $i$-th ordered observation; then the estimator of $S(t)$ can be
constructed easily. Starting from the left of $L_{(1)} < \cdots < L_{(n)}$, with initial weight $1/n$ for each
observation, we derive the $p_i$'s in $n$ steps. At the $k$-th step, if the $k$-th ordered observation
is an exact one or an empty censoring interval, do not make any change to the weights
determined at the last step. Otherwise, distribute the weight assigned at the last step to
the $k$-th ordered observation, which is a nonempty censoring interval, equally among the CO's in the
censoring interval.

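The following is the formula to determine $p = (p_1, \ldots, p_n)$. Denote

$$N_k = \begin{cases} \#\{\text{all CO's in } (L_{(k)}, R_{(k)})\} & \text{if } \delta_{(k)} = 0 \\ 0 & \text{otherwise,} \end{cases} \qquad (2.3)$$

where $\#A$ is the cardinality of the set $A$. Let the initial value of $p_i$ be

$$p_i^{(0)} = 1/n, \quad i = 1, \ldots, n. \qquad (2.4)$$

At Step $k$, $k = 1, \ldots, n$,

$$\begin{cases} p_j^{(k)} = p_j^{(k-1)}, \quad j = 1, \ldots, n, & \text{if } N_k = 0 \\[6pt] p_k^{(k)} = 0, \quad p_j^{(k)} = \begin{cases} p_j^{(k-1)} + p_k^{(k-1)} / N_k & \text{if } L_{(j)} \text{ is a CO in } (L_{(k)}, R_{(k)}) \\ p_j^{(k-1)} & \text{if } L_{(j)} \text{ is not a CO in } (L_{(k)}, R_{(k)}) \end{cases} \quad (j \ne k) & \text{if } N_k \ge 1. \end{cases} \qquad (2.5)$$

$p_i = p_i^{(n)}$ derived from (2.5) is the weight assigned to the $i$-th ordered observation by the
estimator $\hat{S}_c$, $i = 1, \ldots, n$.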
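
The $n$-step procedure can be sketched in C as follows. This is an illustrative sketch and not the authors' program: the names `is_co` and `rti_weights` are hypothetical, the observations are assumed to be already sorted by left endpoint with ties broken so that an enclosing interval precedes the observations it contains, and Condition DI is assumed rather than checked. An exact observation is assumed to be stored with `l[i] == r[i]`.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if observation j is a complete observation (CO) in the
   censoring interval (l[k], r[k]), else 0. */
static int is_co(const double *l, const double *r, size_t j, size_t k)
{
    if (l[j] == r[j])                        /* exact observation */
        return l[k] < l[j] && l[j] < r[k];
    return l[k] <= l[j] && r[j] <= r[k];     /* censoring interval */
}

/* Redistribution-to-the-inside weights for n observations sorted by
   left endpoint. On return p[i] is the weight of the i-th ordered
   observation; nonempty censoring intervals end with weight 0. */
void rti_weights(const double *l, const double *r, double *p, size_t n)
{
    size_t j, k, nk;
    assert(n > 0);
    for (j = 0; j < n; j++)
        p[j] = 1.0 / (double) n;
    for (k = 0; k < n; k++) {
        if (l[k] == r[k])
            continue;                        /* exact: no change */
        /* Under the assumed ordering, every CO of observation k has
           an index larger than k. */
        nk = 0;
        for (j = k + 1; j < n; j++)
            if (is_co(l, r, j, k))
                nk++;
        if (nk == 0)
            continue;                        /* empty interval: no change */
        for (j = k + 1; j < n; j++)
            if (is_co(l, r, j, k))
                p[j] += p[k] / (double) nk;
        p[k] = 0.0;
    }
}
```

With the ordering of Example 2.1 encoded by the arbitrary endpoints `l = {1, 2, 4, 5, 6, 10}` and `r = {9, 3, 8, 5, 7, 11}`, the routine yields the weights (0, 5/24, 0, 5/16, 5/16, 1/6). The double scan over `j` keeps the sketch close to the step-by-step description; a production version could stop scanning once `l[j]` reaches `r[k]`.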

The estimator $\hat{S}_c$ is a probability measure that assigns positive weight to each exact
observation and to each empty censoring interval, and assigns no weight to nonempty
censoring intervals. If there are no empty censoring intervals in the data, the estimator of
$S(t)$ is $\hat{S}_c(t) = \sum_{L_{(i)} > t} p_i$. Otherwise, there exists some $k$ such that $\delta_{(k)} = 0$ and $N_k = 0$.
It is well known that in such a case the GMLE is not uniquely determined in the interval
$(L_{(k)}, R_{(k)})$ (see Peto (1973)). For convenience, we define that the weight $p_k$ is assigned to
$R_{(k)}$. Thus in the latter case, the estimator of $S(t)$ is

$$\hat{S}_c(t) = \sum_{i:\ \delta_{(i)} = 1,\ L_{(i)} > t} p_i + \sum_{i:\ \delta_{(i)} = 0,\ R_{(i)} > t} p_i. \qquad (2.6)$$

We call $\hat{S}_c$ the redistribution-to-the-inside estimator (RTIE).
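
Given the final weights, evaluating the estimator at a point $t$ only requires summing the weights located strictly to the right of $t$. The following is a minimal sketch under the convention stated above, namely that the weight of an empty censoring interval sits at its right endpoint and that nonempty intervals carry weight zero; `rtie_survival` is a hypothetical name, and exact observations are assumed to be stored with equal left and right endpoints, so that the array of right endpoints locates every positive weight.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluates the estimated survival function at t.
   r[i] is the right endpoint of the i-th observation (equal to the
   observed value for an exact observation) and p[i] is its weight.
   A weight counts toward S(t) while its location is greater than t,
   which makes the estimate right-continuous. */
double rtie_survival(const double *r, const double *p, size_t n, double t)
{
    double s = 0.0;
    size_t i;
    assert(r != NULL && p != NULL);
    for (i = 0; i < n; i++)
        if (r[i] > t)
            s += p[i];
    return s;
}
```

Using the weights (0, 5/24, 0, 5/16, 5/16, 1/6) of Example 2.1 with illustrative right endpoints 9, 3, 8, 5, 7, 11, the function returns 19/24 at t = 3 (the value of $R_2$ in this encoding), 23/48 at t = 5, 1/6 at t = 7 and 0 at t = 11, matching the step function in (2.2).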

Remark 2.1. The PLE is usually undefined for $t > L_{(n)}$ if $\delta_{(n)} = 0$ with right-censored data and for $t < R_{[1]}$ if $\delta_{[1]} = 0$ with left-censored data, where $R_{[1]}$ is the smallest of the $R_j$'s and $\delta_{[1]}$ is the corresponding $\delta$. Expression (2.6) defines $\hat S_c(t)$ everywhere for $t \ge 0$.

Remark 2.2. If there is a tie in the $L_i$'s or $R_i$'s, we break the tie as follows:

1. If $\{L_i, R_i\} = \{L_j, R_j\}$, $i < j$, then suppose that $L_i$ occurs before $L_j$;

2. If $L_i = L_j$ and $L_j < R_j < R_i$, then suppose that $L_i$ occurs before $L_j$;

3. If an exact observation and the left endpoint of a censoring interval are equal, i.e., $L_i = X_i = L_j < R_j$, then suppose that $X_i$ occurs before $L_j$.

4. If $L_j < R_j = L_i$, then suppose that $R_j$ occurs before $L_i$.

Consequently, if, for example, the sample size is two and $(L_1, R_1)$ and $(L_2, R_2)$ are equal censoring intervals, we define the order statistics as $L_{(1)} = L_1$ and $L_{(2)} = L_2$. Furthermore, we regard $\{L_2, R_2\}$ as a CO of $(L_1, R_1)$, but do not regard $\{L_1, R_1\}$ as a CO of $(L_2, R_2)$. With the above convention, $\hat S_c(t)$ as in (2.6) is well defined even when there are ties in the $L_i$'s or in the $R_i$'s.
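
The first three tie-breaking rules amount to a sort order on the observations, which can be sketched as follows. The function name `tie_break_order` and the `(L, R)` encoding (with `L == R` for an exact observation) are assumptions of this example. Rule 4 concerns a right endpoint meeting a left endpoint and is not a sorting rule; it is respected by treating censoring intervals as open when testing containment.

```python
def tie_break_order(obs):
    """Return indices of obs in the order implied by tie rules 1 to 3 (sketch).

    obs: list of (L, R) pairs, L == R for an exact observation.
    """
    def key(i):
        left, right = obs[i]
        is_interval = left < right
        # Rule 3: at equal left endpoints an exact observation comes first.
        # Rule 2: at equal left endpoints the wider interval comes first.
        return (left, is_interval, -right)
    # Rule 1: Python's sort is stable, so identical observations keep
    # their original index order.
    return sorted(range(len(obs)), key=key)

tied = [(2, 5), (2, 2), (2, 8), (2, 5)]
```

With this order, the second copy of the interval (2, 5) follows the first, so it can be treated as a CO of the first but not the other way round.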

3. Generalized MLE. We first define the GMLE. Kiefer and Wolfowitz (1956) suggested that for a given nondominated family of probability measures $\mathcal{P}$ one can define a generalized maximum likelihood estimator as follows: For $P_1$, $P_2$ in $\mathcal{P}$, let $f(x; P_1, P_2) = \frac{dP_1}{d(P_1 + P_2)}(x)$, the Radon-Nikodym derivative of $P_1$ with respect to $P_1 + P_2$. If $x$ represents the observed data vector, $\hat P$ is a GMLE if and only if

$$f(x; \hat P, P) \ge f(x; P, \hat P) \quad \text{for all } P \text{ in } \mathcal{P}. \tag{3.1}$$

It is desirable that the RTIE $\hat S_c$ be a GMLE. It turns out that this is true under the DI Model.

Hereafter, lower case letters denote the values of the corresponding random variables. As discussed in Tsai and Crowley (1985), the definition of the GMLE, $\hat P$, of an unknown probability measure $P$ reduces to $\hat P(x) \ge P(x)$, where

$$P(x) = \prod_{i=1}^{n} [P\{X = l_{(i)}\}]^{\delta_{(i)}}\, [P\{X \in (l_{(i)}, r_{(i)})\}]^{1 - \delta_{(i)}}, \tag{3.2}$$

$X$ is the random variable with the distribution function $F(t)$, and the lower case letters $l_i$'s are the values of the corresponding random variables $L_i$'s. In view of Remark 2.2, without loss of generality (WLOG), we can assume that there exist no ties in the $L_i$'s and $R_i$'s. Note that the likelihood $[P\{X = l_{(i)}\}]^{\delta_{(i)}} [P\{X \in (l_{(i)}, r_{(i)})\}]^{1 - \delta_{(i)}}$ for each observation depends only on the values of $F(t)$ at the $L_{(i)}$ and $R_{(i)}$ (see Peto (1973)). Let $P$ assign probability $p_i$ to $l_{(i)}$ if $\delta_{(i)} = 1$ and to the set $[l_{(i)}, r_{(i)}] \setminus \bigcup_{j > i} \{[l_{(j)}, r_{(j)}]\}$ if $\delta_{(i)} = 0$, $i = 1, \ldots, n$, where "$\setminus$" stands for set minus. Given an $i$, the likelihood


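$$[P\{X = l_{(i)}\}]^{\delta_{(i)}}\, [P\{X \in (l_{(i)}, r_{(i)})\}]^{1 - \delta_{(i)}} = p_i^{\delta_{(i)}} \Big( \sum_{j=i}^{i+M_i} p_j \Big)^{1 - \delta_{(i)}} = \sum_{j=i}^{i+M_i} p_j, \tag{3.3}$$

where

$$M_i = \begin{cases} \#\{\text{all CO's in } (l_{(i)}, r_{(i)})\} & \text{if } \delta_{(i)} = 0 \\ 0 & \text{otherwise.} \end{cases} \tag{3.4}$$

Then (3.2) is equal to

$$L = \prod_{i=1}^{n} \Big( \sum_{j=i}^{i+M_i} p_j \Big), \quad \text{where } \sum_{i=1}^{n} p_i = 1. \tag{3.5}$$

It is well known that if a censoring interval $(l_j, r_j)$ is empty, the likelihood function $L$ would not change as long as the weight on the interval remains the same (see Peto (1973)). Thus the definition of the GMLE for $t$ within the empty censoring interval $(L_i, R_i)$ need not be unique.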

Theorem 1. Suppose that the interval-censored data satisfy Condition DI. Then the RTIE $\hat S_c(t)$ (as in (2.6)) of $S(t)$ is a GMLE.
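
A small numerical check can make the statement of Theorem 1 concrete. The sketch below is an illustration only and proves nothing: it evaluates a product-of-sums likelihood of the kind used in this section for one hand-picked DI data set without ties, and compares the value at the redistributed weights with the values at randomly drawn weight vectors. The function names, the `(L, R)` encoding, the example data and the hard-coded weights are all assumptions of this example.

```python
import random
from fractions import Fraction

def likelihood(obs, w):
    """Product over observations of the mass each one receives (sketch).

    An exact observation contributes its own weight; a censoring interval
    contributes its own weight plus the weights of all CO's inside it.
    Assumes DI data with no tied observations.
    """
    value = 1
    for k, (lk, rk) in enumerate(obs):
        mass = w[k]
        if lk < rk:
            for j, (lj, rj) in enumerate(obs):
                if j == k:
                    continue
                if lj == rj:
                    inside = lk < lj < rk
                else:
                    inside = lk <= lj and rj <= rk
                if inside:
                    mass += w[j]
        value *= mass
    return value

obs = [(0, 10), (2, 2), (3, 6), (4, 4), (5, 5)]
# Weights obtained by redistributing to the inside, worked out by hand.
rti = [Fraction(0), Fraction(1, 4), Fraction(0), Fraction(3, 8), Fraction(3, 8)]
best = likelihood(obs, rti)

def random_weights(n, rng):
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [x / total for x in raw]

rng = random.Random(0)
never_beaten = all(
    likelihood(obs, random_weights(len(obs), rng)) <= float(best) + 1e-12
    for _ in range(2000)
)
```

In this example the likelihood at the redistributed weights is 27/1024, and none of the random alternatives exceeds it.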

Proof: In order to prove the theorem, it suffices to show the following statement:

($S_n$) the $(p_1, \ldots, p_n)$ that maximizes $L$ as in (3.5) is the same as the $(p_1^{(n)}, \ldots, p_n^{(n)})$ determined by (2.4) and (2.5).

We prove it by induction on the sample size $n$.

When the sample size $n$ equals 1, $L$ as in (3.5) is maximized by $p_1 = 1$ and (2.5) yields $p_1^{(1)} = 1$. Thus the theorem is trivially true.

Now assume that the statement ($S_n$) is true for all sample sizes $n < m$. We will show that the theorem holds also for $n = m$.

When the sample size is $m$, in view of Remark 2.2, WLOG, we can assume that

$$L_1 < \cdots < L_m. \tag{3.6}$$


Since Condition DI is satisfied, one of the following occurs:

(1) $L_1 < L_i \le R_i \le R_1$, $i = 2, \ldots, m$.

(2) the data set can be partitioned into 2 disjoint subsets, with $n_1$ observations in the first subset ($1 \le n_1 < m$) and $n_2$ in the second subset ($n_1 + n_2 = m$).

By disjoint subsets, we mean $\max\{R_i;\ i \le n_1\} \le \min\{L_i;\ i > n_1\}$.

We first assume that case (2) is true. Let $p$ be the sum of the weights assigned to the elements of the first subset and $q$ the one to the elements of the second subset. Then $p + q = 1$. Note that the likelihood function

$$L = \Big[ \prod_{i=1}^{n_1} \Big( \sum_{j=i}^{i+M_i} p_j \Big) \Big] \cdot \Big[ \prod_{i > n_1} \Big( \sum_{j=i}^{i+M_i} p_j \Big) \Big] = \Big[ \prod_{i=1}^{n_1} \Big( \sum_{j=i}^{i+M_i} \frac{p_j}{p} \Big) \Big] \cdot p^{n_1} q^{n_2} \cdot \Big[ \prod_{i > n_1} \Big( \sum_{j=i}^{i+M_i} \frac{p_j}{q} \Big) \Big].$$


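It is easy to see that $L$ is maximized by maximizing its three factors:

$$\prod_{i=1}^{n_1} \Big( \sum_{j=i}^{i+M_i} \frac{p_j}{p} \Big), \qquad \prod_{i > n_1} \Big( \sum_{j=i}^{i+M_i} \frac{p_j}{q} \Big), \qquad p^{n_1} q^{n_2}. \tag{3.7}$$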


Let $y_i = p_i / p$; then $\sum_{i=1}^{n_1} y_i = 1$ and $i + M_i \le n_1$ for $i = 1, \ldots, n_1$. Thus, the first product can be viewed as the likelihood $\prod_{i=1}^{n_1} \sum_{j=i}^{i+M_i} y_j$ of interval-censored data with $n_1$ observations: $\{L_1, R_1\}, \ldots, \{L_{n_1}, R_{n_1}\}$. Since $n_1 < m$, by the induction assumption ($S_n$), the first product is maximized by $y_i^{(n_1)}$, $i = 1, \ldots, n_1$, determined by rule (2.5) for the sample size $n_1$ (and by substituting $p_i = y_i$ in (2.5)).

Let $z_{i - n_1} = p_i / q$, $i = n_1 + 1, \ldots, m$. In a similar manner, it can be shown that the second product in (3.7) is maximized by $z_i^{(n_2)}$, $i = 1, \ldots, n_2$, determined by rule (2.5) for sample size $n_2$ with observations $\{L_{n_1+1}, R_{n_1+1}\}, \ldots, \{L_m, R_m\}$ (and by substituting $p_i = z_i$ in (2.5)).

Since the third product in (3.7) is $p^{n_1} (1 - p)^{n_2}$, it is maximized by setting $p = n_1/m$ and $1 - p = q = n_2/m$.
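
The last step follows from setting the derivative of $n_1 \log p + n_2 \log(1 - p)$ to zero, which gives $p = n_1/(n_1 + n_2) = n_1/m$. A quick numerical confirmation is sketched below; the function name `best_split` and the grid search are assumptions of this example and are not part of the proof.

```python
import math

def best_split(n1, n2, grid=1000):
    """Grid search for the p in (0, 1) maximizing p**n1 * (1 - p)**n2 (sketch)."""
    def log_third_factor(p):
        # Logarithm of the third factor; same maximizer as the factor itself.
        return n1 * math.log(p) + n2 * math.log(1 - p)
    candidates = [k / grid for k in range(1, grid)]
    return max(candidates, key=log_third_factor)
```

For example, with $n_1 = 3$ and $n_2 = 7$ the search returns the grid point at $3/10$, in agreement with $p = n_1/m$.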

To complete the proof for case (2) when the sample size is $m$, it suffices to verify that

(1) For the same data, the weights $(p_1^{(m)}, \ldots, p_m^{(m)})$ determined by (2.5) with $n = m$ satisfy $p_1^{(m)} + \cdots + p_{n_1}^{(m)} = n_1/m$ and $p_{n_1+1}^{(m)} + \cdots + p_m^{(m)} = n_2/m$.

(2) The weights $p_i^{(m)}$, $i = 1, \ldots, m$, determined by (2.5) for sample size $n = m$, satisfy that $y_i = p_i^{(m)}/p$, $i = 1, \ldots, n_1$, is the weight determined by (2.5) for the sample size $n = n_1$ with observations $\{L_1, R_1\}, \ldots, \{L_{n_1}, R_{n_1}\}$, and $z_{i-n_1} = p_i^{(m)}/q$, $i = n_1 + 1, \ldots, m$, is the weight determined by (2.5) for the sample size $n = n_2$ with observations $\{L_{n_1+1}, R_{n_1+1}\}, \ldots, \{L_m, R_m\}$.

We first prove statement (1). Note that none of the observations $\{L_{n_1+1}, R_{n_1+1}\}, \ldots, \{L_m, R_m\}$ is a CO of any of the possible censoring intervals $(L_i, R_i)$, $i \le n_1$. Thus the RTI method will not move any of the original weight $1/m$ on $\{L_i, R_i\}$, $i \le n_1$, to $\{L_j, R_j\}$, $j > n_1$. On the other hand, none of the observations $\{L_1, R_1\}, \ldots, \{L_{n_1}, R_{n_1}\}$ is a CO of any of the possible censoring intervals $(L_i, R_i)$, $i > n_1$. Thus the RTI method will not move any of the original weight $1/m$ on $\{L_i, R_i\}$, $i > n_1$, to $\{L_j, R_j\}$, $j \le n_1$. It follows that $p_1^{(m)} + \cdots + p_{n_1}^{(m)} = \sum_{i=1}^{n_1} p_i^{(0)} = \sum_{i=1}^{n_1} 1/m = n_1/m$ and $p_{n_1+1}^{(m)} + \cdots + p_m^{(m)} = n_2/m$. Thus statement (1) holds.

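In the following, we prove statement (2). Let $p_i^{(m)}$, $i = 1, \ldots, m$, be the values of $p_i$ determined by (2.5) (when $n = m$). Then for $i \le n_1$, multiplying both sides of (2.4) and (2.5) by $\frac{m}{n_1}$ yields

$$\frac{m}{n_1} p_i^{(0)} = \frac{m}{n_1} \cdot \frac{1}{m} = \frac{1}{n_1}, \quad i = 1, \ldots, n_1; \tag{3.8}$$

and for $k = 1, \ldots, n_1$,

$$\begin{cases} \frac{m}{n_1} p_j^{(k)} = \frac{m}{n_1} p_j^{(k-1)}, \ j = k, \ldots, n_1 & \text{if } N_k = 0 \\[6pt] \left\{\begin{array}{ll} \frac{m}{n_1} p_k^{(k)} = 0 & \\ \frac{m}{n_1} p_j^{(k)} = \frac{m}{n_1} p_j^{(k-1)} + \frac{m}{n_1} p_k^{(k-1)}/N_k & \text{if } L_{(j)} \text{ is a CO in } (L_{(k)}, R_{(k)}) \\ \frac{m}{n_1} p_j^{(k)} = \frac{m}{n_1} p_j^{(k-1)} & \text{if } L_{(j)} \text{ is not a CO in } (L_{(k)}, R_{(k)}) \end{array}\right. & \text{if } N_k \ge 1. \end{cases} \tag{3.9}$$

Let $y_i^{(k)} = \frac{m}{n_1} p_i^{(k)}$, for possible $i$ and $k$; then (3.8) and (3.9) yield

$$y_i^{(0)} = 1/n_1, \quad i = 1, \ldots, n_1, \tag{3.10}$$

and for $k = 1, \ldots, n_1$,

$$\begin{cases} y_j^{(k)} = y_j^{(k-1)}, \ j = k, \ldots, n_1 & \text{if } N_k = 0 \\[6pt] \left\{\begin{array}{ll} y_k^{(k)} = 0 & \\ y_j^{(k)} = y_j^{(k-1)} + y_k^{(k-1)}/N_k & \text{if } L_{(j)} \text{ is a CO in } (L_{(k)}, R_{(k)}) \\ y_j^{(k)} = y_j^{(k-1)} & \text{if } L_{(j)} \text{ is not a CO in } (L_{(k)}, R_{(k)}) \end{array}\right. & \text{if } N_k \ge 1. \end{cases} \tag{3.11}$$

Note that (3.10) and (3.11) are identical to (2.4) and (2.5), respectively, provided that the observations are $\{L_1, R_1\}, \ldots, \{L_{n_1}, R_{n_1}\}$ and $n = n_1$. This proves that the weights $p_i^{(m)}$, $i = 1, \ldots, m$, assigned by $\hat S_c$ for sample size $m$, satisfy that $y_i = p_i^{(m)}/p$, $i = 1, \ldots, n_1$, is the weight assigned by $\hat S_c$ for the sample size $n_1$ with observations $\{L_1, R_1\}, \ldots, \{L_{n_1}, R_{n_1}\}$. In a similar manner, we can show that $z_{i-n_1} = p_i^{(m)}/q$, $i = n_1 + 1, \ldots, m$, is the weight assigned by $\hat S_c$ for the sample size $n_2$ with observations $\{L_{n_1+1}, R_{n_1+1}\}, \ldots, \{L_m, R_m\}$. This completes the proof of statement (2) and the proof for case (2).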

To complete the proof for the case $n = m$, we need to show that $\hat S_c$ is the GMLE when case (1) is true, i.e., $(L_j, R_j) \subset (L_1, R_1)$ for all $j > 1$. Note that if the latter is true, the likelihood function (3.5) is equal to

$$L = \Big( \sum_{i=1}^{m} p_i \Big) \prod_{j=2}^{m} \Big( \sum_{k=j}^{j+M_j} p_k \Big).$$

Fixing $p_1 + p_2$, $L$ increases by setting $p_1 = 0$. That is,

(B1) the solution of the GMLE for $p_1$ is 0.

When $p_1 = 0$,

$$L = \Big( \sum_{i=2}^{m} p_i \Big) \prod_{j=2}^{m} \Big( \sum_{k=j}^{j+M_j} p_k \Big) = \prod_{j=2}^{m} \Big( \sum_{k=j}^{j+M_j} p_k \Big), \quad \text{where } \sum_{i=2}^{m} p_i = 1.$$

The likelihood is the same as the one for the sample size $m - 1$, with observations $(L_2, R_2), \ldots, (L_m, R_m)$. Thus

(B2) the solution of the GMLE for $(p_2, \ldots, p_m)$ is the same as the solution, $(p_1^*, \ldots, p_{m-1}^*)$, of the GMLE with $m - 1$ observations $\{L_2, R_2\}, \ldots, \{L_m, R_m\}$.

We now show that (2.5) yields (B1) and (B2) too. Note that, for $k = 1$, since $N_1 = m - 1$, (2.5) yields

$$p_1^{(1)} = 0, \tag{3.12}$$

which is the same as in (B1). Furthermore, for $k = 1$, (2.5) yields

$$p_i^{(1)} = \frac{1}{m} + \frac{1}{m} \Big/ (m - 1) = \frac{1}{m - 1}, \quad i = 2, \ldots, m.$$

This can be viewed as (2.4) with $n = m - 1$ and observations $\{L_2, R_2\}, \ldots, \{L_m, R_m\}$, say,

$$p_{i-1}^{*(0)} = \frac{1}{m - 1}, \quad i = 2, \ldots, m. \tag{3.13}$$

Then

(C) the $(p_2^{(m)}, \ldots, p_m^{(m)})$ determined by (2.4) and (2.5) for sample size $n = m$ and $k = 2, \ldots, m$ is the same as $(p_1^{*(m-1)}, \ldots, p_{m-1}^{*(m-1)})$, determined by (2.4) (or (3.13)) and (2.5) for sample size $n = m - 1$ (and replacing $p_i$ by $p_i^*$), with observations $\{L_2, R_2\}, \ldots, \{L_m, R_m\}$.

By the induction assumption ($S_n$), and conclusions (B2) and (C), the solution of the GMLE for $(p_2, \ldots, p_m)$ is the same as the $(p_2^{(m)}, \ldots, p_m^{(m)})$ determined by (2.4) and (2.5). It indicates that $\hat S_c$ is the same as the GMLE for the data of sample size $m$. This completes the proof for case (1). It also concludes the proof of statement ($S_n$) for $n = m$ and the proof of the theorem. $\Box$

An estimate $\hat S$ with interval-censored data under the DI model is self-consistent if it satisfies


S{t) 


n 


Li<t<R 


S{t)  -  S{Ri) 
,S{Li)-S{Riy 


It is worth mentioning that for doubly-censored data, the corresponding equation is different. A GMLE is a self-consistent estimate. It follows:

Corollary.  The  estimator  Sc  is  self-consistent. 


4. A simple explicit expression of the RTIE under the DI Model. It is expected that the RTIE $\hat S_c$ has a simple expression like (2.1) for the PLE. Under the DI Model, the estimator $\hat S_c(t)$ can be expressed in the following form. First note that if $(L_{(i)}, R_{(i)})$ and $(L_{(j)}, R_{(j)})$ ($i < j$) are two censoring intervals that contain $t$, then Condition DI implies that $\{L_{(j)}, R_{(j)}\}$ is a CO in $(L_{(i)}, R_{(i)})$. Given $t$, referring to $N_k$ as in (2.3), define

$$N_k^t = \begin{cases} \#\{\text{all CO's in } (L_{(k)}, t]\} & \text{if } \delta_{(k)} = 0 \text{ and } L_{(k)} < t < R_{(k)} \\ 0 & \text{otherwise} \end{cases}$$

and

$$q_k^t = 1[N_k^t > 0].$$

Then


1  -  ss)  =  E 


l[t  >  Ri 


i=l 


n 


(4.2) 

(4.3) 


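where $\prod_{k<1} x_k = 1$ for any $x_k$ and $x/0 = 1$ for all $x \ge 0$. Let $t_1 < \cdots < t_m$ be the indices of all (ordered) censoring intervals that contain $t$ (so that $t_1, \ldots, t_m$ depend on $t$) and for which $(L_{(t_j)}, R_{(t_j)})$ is not empty. Then (4.3) equals

$$\sum_{i=1}^{n} \frac{1[t \ge R_i]}{n} + \sum_{j=1}^{m} \frac{1}{n} \Big[ \prod_{k=1}^{j-1} \Big( 1 + \frac{1}{N_{t_k}} \Big) \Big] \frac{N^t_{t_j}}{N_{t_j}}. \tag{4.4}$$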
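
Expression (4.4) can be evaluated directly, without running the step-by-step redistribution. The sketch below is a reading of (4.4) for DI data without ties and is not the authors' code: the function name `rti_cdf_explicit`, the helper `count_cos`, and the `(L, R)` encoding with `L == R` for exact observations are assumptions of this example.

```python
from fractions import Fraction
import math

def rti_cdf_explicit(obs, t):
    """Evaluate 1 - S(t) through the explicit form (4.4) (illustrative sketch)."""
    n = len(obs)

    def count_cos(k, upper, closed):
        # Number of CO's of interval k lying in (L_k, upper) or (L_k, upper].
        left = obs[k][0]
        count = 0
        for j, (l2, r2) in enumerate(obs):
            if j == k:
                continue
            if l2 == r2:
                inside = left < l2 and (l2 <= upper if closed else l2 < upper)
            else:
                inside = left <= l2 and r2 <= upper
            count += inside
        return count

    # First term: fraction of observations whose right endpoint is <= t.
    total = Fraction(sum(1 for _, r in obs if r <= t), n)
    # Censoring intervals containing t, outermost first (nested under DI).
    chain = sorted((k for k, (l, r) in enumerate(obs) if l < t < r),
                   key=lambda k: (obs[k][0], -obs[k][1]))
    weight = Fraction(1, n)  # weight accumulated at the current interval
    for k in chain:
        n_k = count_cos(k, obs[k][1], closed=False)
        if n_k == 0:
            continue  # empty interval: not part of the sum in (4.4)
        n_k_t = count_cos(k, t, closed=True)
        total += weight * Fraction(n_k_t, n_k)
        weight *= 1 + Fraction(1, n_k)
    return total

nested = [(0, 10), (2, 2), (3, 6), (4, 4), (5, 5)]
right_censored = [(1, 1), (2, math.inf), (3, 3), (4, 4)]
```

For the nested example at $t = 4$ the three contributions are $2/5$, $1/10$ and $1/8$, which add up to $5/8$, the total weight the redistribution places on the exact observations at 2 and 4.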


Expression (4.4) is another way to express the idea of the redistribution-to-the-inside method. The first term in (4.4), $\sum_{i=1}^{n} 1[t \ge R_i]/n$, is the fraction of the CO's in $(-\infty, +\infty)$ which are in $(-\infty, t]$, i.e., it is the empirical weight carried by the CO's which are $\le t$. Each of the next $m$ summands in (4.4) has two parts: $N^t_{t_j}/N_{t_j}$ is the fraction of the CO's in the censoring interval $(L_{(t_j)}, R_{(t_j)})$ which are in $(L_{(t_j)}, t]$. The quantity $\frac{1}{n} \big[ \prod_{k=1}^{j-1} (1 + \frac{1}{N_{t_k}}) \big]$ is the weight accumulated at the $j$-th nonempty censoring interval up to the $(j-1)$-th step. Thus the $j$-th summand is the weight distributed to the total of the CO's in the interval $(L_{(t_j)}, t]$ by the censoring interval $(L_{(t_j)}, R_{(t_j)})$. This indicates that expression (4.4) is the same as expression (2.6) under the DI Model.

It can be shown that expression (4.4) reduces to (2.1) with right-censored data and reduces to the PLE with left-censored data. Thus expression (4.3) or (4.4) is the unified expression which includes the PLE.

Abstract: Yu and Wong (1993) propose a redistribution-to-the-inside method to derive an explicit expression for the generalized maximum likelihood estimator of an unknown distribution function $F$ with interval-censored data which satisfy that, given any pair of censoring intervals, either they are disjoint or one is a subset of the other. We call such a model a DI model. Both the right censorship model and the left censorship model are special cases of the DI model. Thus, in the latter cases the expression is exactly the Product-limit estimator. In this paper, we derive the uniformly almost sure limit of the estimator on $[0, +\infty)$ for any arbitrary $F$ and any arbitrary censoring distribution function $G$ under the DI Model.

1. Introduction. We consider nonparametric estimation of the distribution function $F$ of a real-valued random variable $X$ (or its survival function $S(t) = 1 - F(t) = P\{X > t\}$), when the sample data are incomplete due to restricted observation brought about by interval censoring.

Suppose that $\{X_1, L_1, R_1\}, \ldots, \{X_n, L_n, R_n\}$ are i.i.d. random vectors from a population $\{X, L, R\}$ and that $X$ and $\{L, R\}$ are independent. We only observe $X$ if $X \notin (L, R)$; otherwise, we only observe $\{L, R\}$. Denote by $G(l, r)$ the joint distribution function of $\{L, R\}$ and by $\mathcal{V}$ the set of all the possible values $(l, r)$ of the random interval $(L, R)$. Denote

$$\{L^*, R^*\} = \begin{cases} \{X, X\} & \text{if } X \notin (L, R) \\ \{L, R\} & \text{otherwise,} \end{cases} \tag{1.1}$$

and denote $\{L_i^*, R_i^*\}$, $i = 1, \ldots, n$, in an obvious way. Then $\{L_1^*, R_1^*\}, \ldots, \{L_n^*, R_n^*\}$ are interval-censored observations. An observation is said to be exact if $L_i^* = R_i^*$ (in which case, it equals $X_i$) and is called a censoring interval if $R_i^* - L_i^* > 0$. Interval-censored data are said to be from a DI interval-censorship model if observations $\{\{L_i^*, R_i^*\};\ i = 1, \ldots, n\}$ satisfy

Condition DI (Disjoint or Included): Given any two censoring intervals, $(L_i^*, R_i^*)$ and $(L_j^*, R_j^*)$, either they are disjoint or one is a subset of the other.

* Partially supported by NSF Grant DMS-9402561 and DAMD17-94-J-4332.

** Partially supported by DAMD17-94-J-4332.

The following are examples in which data satisfying Condition DI may arise.

Example 1.1. Suppose that $a_1, a_2, \ldots$ are scheduled check-up times for patients of some disease ($0 < a_1 < a_2 < \cdots$ and 0 corresponds to the first visit of each patient). Every patient is followed according to this schedule to monitor the disease status. We either know the exact survival time $X$ of the patient or the patient failed to show up since a scheduled check-up, resulting in an $L$, which is the time the patient last appeared; thus $L$ takes value $a_i \in \{0, a_1, a_2, \ldots\}$. In the latter case, either the patient is lost to follow-up so that $R = +\infty$, or it is learned at time $a_{i+1}$ (when he missed the appointment) that the patient died before $R$ so that $R \in (a_i, a_{i+1}]$.

Example  1.2.  In  a  cancer  follow-up  study,  patients  are  monitored  for  the  status  of 
a  clinical  outcome,  such  as  relapse  or  disease  progression,  at  scheduled  time  points.  When 
the inter-follow-up interval is wide, say a few months, and the outcome status requires a
careful  objective  clinical  assessment,  it  may  not  be  possible  to  know  the  exact  value  of  the 
time-to-event  variable  X  (for  instance,  time  from  achievement  of  a  complete  response  to 
disease  progression  as  determined  by  rigorous  pathological  findings)  for  some  patients.  The 
reason  for  this  is  that  the  event  can  take  place  sometime  between  the  last  and  current  visits 
without  the  patients  noticing  any  changes  until  they  are  examined  at  the  current  follow-up. 
For  these  patients,  their  X  values  are  known  only  to  lie  in  an  interval  and  interval-censored 
data  are  obtained.  In  particular,  if  the  schedule  is  the  same  for  all  patients,  DI  data  are 
obtained. 

Note that the familiar right-censored data satisfy Condition DI, with $R = +\infty$ if $L < R$, since $(x, +\infty) \supset (y, +\infty)$ if $x < y$. Similarly, the left-censored data also satisfy Condition DI with $L = -\infty$ if $L < R$.
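
Condition DI is a pairwise property of the censoring intervals, so it can be checked mechanically. The following sketch is an illustration only; the function name `satisfies_di` and the representation of a censoring interval as an open interval `(L, R)` with `L < R` are assumptions of this example. Exact observations are not passed in, since the condition concerns censoring intervals only.

```python
import math
from itertools import combinations

def satisfies_di(intervals):
    """Return True if every pair of open intervals is disjoint or nested (sketch)."""
    for (a, b), (c, d) in combinations(intervals, 2):
        disjoint = b <= c or d <= a            # open intervals share no point
        nested = (a <= c and d <= b) or (c <= a and b <= d)
        if not (disjoint or nested):
            return False
    return True

right_censored = [(1, math.inf), (2, math.inf), (5, math.inf)]
left_censored = [(-math.inf, 3), (-math.inf, 7)]
overlapping = [(0, 2), (1, 3)]
```

Right-censored and left-censored intervals pass the check because they are nested, while two partially overlapping intervals such as $(0, 2)$ and $(1, 3)$ do not.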

Peto (1973) and Turnbull (1976) consider the problem of obtaining the generalized maximum likelihood estimate (GMLE) of the underlying survival distribution based on interval-censored data (in the sense of Kiefer and Wolfowitz (1956)) using a Newton-Raphson type algorithm and a self-consistent algorithm, respectively. Chang and Yang
(1987)  and  Groeneboom  and  Wellner  (1992)  deal  with  the  problem  of  estimating  the 
underlying  survival  distribution  with  doubly-censored  data  and  study  the  corresponding 
consistency  properties.  Gu  and  Zhang  (1993)  establish  the  strong  uniform  consistency, 
asymptotic  normality  and  asymptotic  efficiency  of  the  self-consistent  estimator  under  mild 
conditions  on  the  distribution  of  censoring  variables  with  doubly-censored  data.  All  these 
authors,  however,  do  not  derive  a  closed-form  expression  for  their  estimator. 

Kaplan and Meier (1958) derive the Product-limit estimator (PLE) for right-censored data. Efron (1967) shows that the PLE can be obtained through a redistribution-to-the-right technique. Extending Efron's idea, Yu and Wong (1993) propose a redistribution-to-the-inside (RTI) technique, which unifies the redistribution-to-the-right technique and the redistribution-to-the-left technique (Gomez et al. (1992)), and obtain an estimate with DI interval-censored data. The estimator, called the RTIE and denoted by $\hat S_I(t)$, has an explicit expression and is the GMLE under the DI Model (Yu and Wong (1993)).

Gu and Zhang ((1993), page 612) give an example in which a self-consistent algorithm may result in multiple solutions for the self-consistent estimate and a self-consistent estimate may not be the GMLE. Thus it is desirable to find the explicit expression of the GMLE. The motivation for studying the DI model is to find the explicit expression of the GMLE for general interval-censored data, in particular, for a non-DI data set. The RTI method for the DI model may provide some insight on attacking this problem. Yu and Wong (1994) modify the RTI method for non-DI data. The method also results in an explicit expression of the RTIE. They further show that for a special class of non-DI data the estimate derived from the modified RTI method is the GMLE.

Under the DI Model, $\hat S_I(t)$ is the closed form solution to the limit of the Newton-Raphson algorithm studied by Peto (1973) and a solution to the limits of the self-consistent algorithm proposed by Turnbull (1976). As a consequence, it is self-consistent according to the definition given in Turnbull (1976). In particular, it reduces to the PLE under the right censorship model and under the left censorship model. We only consider DI models in this paper.

We derive the almost sure limit of $\hat S_I(t)$ uniformly on $[0, +\infty)$ for any arbitrary $F$ and $G$ (see Theorem 4.2 and Remark 5.1). In the proofs, we use a real analysis approach. In light of the literature, it is conceivable that if we used a stochastic process approach or a martingale approach (see, for example, Gill (1983) or Stute and Wang (1993)), the proof might be shortened. However, using such an approach, additional assumptions on $F$ or $G$ are needed. For example, Stute and Wang (1993) use a martingale approach to show that with right-censored data the PLE is strongly consistent uniformly on the interval $t < \tau$, for any arbitrary $F$ and $G$, provided $F$ and $G$ do not have any discontinuity points in common, where $\tau = \inf\{s;\ F(s) = 1 \text{ or } G(s, +\infty) = 1\}$. Using our approach, we do not need any assumption on $F$ and $G$. Furthermore, our main results imply a stronger result than that of Gomez et al. (1992) on the consistency of the PLE with left-censored data: they show that the PLE with left-censored data is strongly consistent uniformly on $t \ge t_0$ for any arbitrary $t_0$, $F$ and $G$, provided $F(t_0) > 0$ and $G(+\infty, t_0) > 0$. Our main results imply that the PLE with left-censored data is strongly consistent uniformly on $t > \inf\{r;\ G(+\infty, r) > 0\}$ for any arbitrary $F$ and $G$.

In  Section  2,  we  define  the  notation.  In  Section  3,  we  give  a  consistency  proof  when  G 
is  discrete.  In  Section  4,  we  give  the  main  consistency  results  (Theorem  4.2).  In  Section  5, 
we discuss the almost sure limit on the half real line. Some proofs of the lemmas and theorems
are put in the Appendix. For the convenience of readers, we give details in proofs, which
may  be  condensed  in  a  future  revision.  In  particular,  the  Appendix  could  be  deleted  in  a 
future  revision,  since  the  main  idea  of  the  consistency  proof  for  an  arbitrary  G  is  in  the 
proof  of  Theorem  3.1  for  a  discrete  G. 

2. Notation and the GMLE. Hereafter, we assume that observations are from a DI Model. Denote $\delta_i = 1[X_i \notin (L_i, R_i)]$, where $1[A]$ is the indicator function of a set $A$. For convenience, we first assume that there is no tie in the $L_i^*$'s. Let $L_{(i)}^*$ be the $i$-th smallest order statistic of the $L_i^*$'s and let $R_{(i)}^*$ be the $R_j^*$ associated with $L_{(i)}^*$, and denote $\delta_{(i)}$, $X_{(i)}$, $L_{(i)}$ and $R_{(i)}$ in a similar manner, $i = 1, \ldots, n$. An observation $\{L_i^*, R_i^*\}$ is right-censored if $R_i^* = +\infty$ and is left-censored if $L_i^* = -\infty$ and $L_i^* < R_i^*$. A censoring interval is said to be empty if it does not contain exact observations or other censoring intervals $(L_j^*, R_j^*)$. $\{L_i^*, R_i^*\}$ is said to be a complete observation (CO) in an interval if either it is an exact observation which belongs to the interval or it is a censoring interval which is a subset of the interval.

The RTI method works as follows. Initial weight $1/n$ is assigned to each observation. Then starting from $\{L_{(1)}^*, R_{(1)}^*\}$, if it is a nonempty censoring interval, i.e., the event occurred somewhere inside $(L_{(1)}^*, R_{(1)}^*)$, it is reasonable to redistribute its weight equally to the observations inside the interval; otherwise do not make any change. Treating $\{L_{(i)}^*, R_{(i)}^*\}$, $i = 2, 3, \ldots, n$, similarly results in the RTIE $\hat S_I(t)$.

Write $\hat F_I(t) = 1 - \hat S_I(t)$. Yu and Wong show that


Mt)  =  E 


i[jjr  <  «j 


Nj 


i=l 


n 


(2.1)

where

$$N_j = \begin{cases} \#\{\text{all CO's in } (L_{(j)}^*, R_{(j)}^*)\} & \text{if } \delta_{(j)} = 0 \\ 0 & \text{otherwise,} \end{cases}$$

$\#A$ is the cardinality of a set $A$,

$$N_j^t = \begin{cases} \#\{\text{all CO's in } (L_{(j)}^*, t]\} & \text{if } \delta_{(j)} = 0 \text{ and } L_{(j)}^* < t < R_{(j)}^* \\ 0 & \text{otherwise,} \end{cases} \tag{2.2}$$

$q_j^t = 1[N_j^t > 0]$, $x/0 = 1$ for all $x \ge 0$, and an empty product $\prod_{1 \le k < 1}$ equals 1. Under the DI Model, Yu and Wong (1993) show that $\hat S_I(t)$ is the GMLE of $S$. It is worth noting that $\hat F_I(t)$ is right continuous, nondecreasing in $t$ and takes values in $[0, 1]$. It is well known (see Peto (1973)) that the GMLE is uniquely defined in terms of weights on exact observations or on empty intervals, but is not uniquely defined for $t$ within an empty censoring interval. However, the definition of the GMLE for $t$ in the following three cases affects its property of uniformly strong consistency: (1) the largest order statistic $L_{(n)}^*$ is right-censored; (2) $\min_i\{R_i^*\}$ is left-censored; or (3) $\{L_{(n)}^*, R_{(n)}^*\}$ is an empty censoring interval and $R_{(n)}^* = \min_i\{R_i^*\}$. (See, for example, Yu and Li (1994).) In particular, if case (2) (or case (3)) is true and $\hat S_I(t)$, $t < \min_i R_i^*$ (or $t \in [L_{(n)}^*, R_{(n)}^*]$), is defined as in (2.1), then $\hat S_I(t)$ is not uniformly strongly consistent for $t \in O$. Thus we use the following modification.

Remark 2.1. If either case (2) or case (3) occurs, we modify (2.1) as follows: in case (2), $\hat S_I(t) = \hat S_I(\min_i R_i^*)$ for $t < \min_i R_i^*$; in case (3), define $\hat S_I(t)$ to be a right continuous step function with a unique jump at the median of $L_{(n)}^*$ and $R_{(n)}^*$ for $t \in [L_{(n)}^*, R_{(n)}^*]$.

Remark 2.2. If there are ties in the $L_i^*$'s, neither $N_k$ nor expression (2.1) is well defined. In such cases, we break the ties as follows:

1. If $\{L_i^*, R_i^*\} = \{L_j^*, R_j^*\}$, $i < j$, then suppose that $L_i^*$ occurs before $L_j^*$;

2. If $L_i^* = L_j^*$ and $L_j^* < R_j^* < R_i^*$, then suppose that $L_i^*$ occurs before $L_j^*$;

3. If an exact observation and the left endpoint of a censoring interval are equal, i.e., $L_i^* = X_i = L_j^* < R_j^*$, then suppose that $X_i$ occurs before $L_j^*$.

4. If $L_j^* < L_i^* = R_j^*$, then suppose that $R_j^*$ occurs before $L_i^*$.

Thus, for example, if the sample size is 2 and the two censoring intervals $(L_1^*, R_1^*)$ and $(L_2^*, R_2^*)$ are equal, we define the order statistics as $L_{(1)}^* = L_1^*$ and $L_{(2)}^* = L_2^*$, and regard $\{L_2^*, R_2^*\}$ as a CO of $(L_1^*, R_1^*)$, but do not regard $\{L_1^*, R_1^*\}$ as a CO of $(L_2^*, R_2^*)$.

Recall that under the right censorship model, with probability 1 (w.p.1) there are no exact observations for $t > \tau$; thus, people only study the consistency of the PLE for $t < \tau$. A similar condition, namely, $t \in O$, occurs in interval-censored data, where

$$O = \{t;\ t \notin [\tau_l, \tau_r]\}, \quad \tau_l = \inf\{t;\ P\{L < t < R\} = 1 \text{ or } t = +\infty\},$$

$$\tau_r = \min\{\sup\{t;\ P\{L < t < R\} = 1\},\ +\infty\}. \tag{2.2}$$

If $\tau_l = +\infty$, then $O = [0, \infty)$. Otherwise, $O$ is either $[0, \tau_l)$ (right-censorship models) or $(\tau_r, \infty)$ (left-censorship models) or $[0, \tau_l) \cup (\tau_r, \infty)$, where $0 < \tau_l \le \tau_r < \infty$. There are no observations within the interval $(\tau_l, \tau_r)$ (w.p.1), thus $F(t)$ (or $S(t)$) is not estimable for $t \in (\tau_l, \tau_r)$. Denote by $\bar A$ the closure of a set $A$. We will study the consistency of $\hat F_I(t)$ (or $\hat S_I(t)$) for $t \in O$ or on its boundary.

3. Consistency of $\hat S_I(t)$ when $G$ is discrete. One of our main results is

$$\text{(1)} \ \lim_{n \to \infty} \sup_{t \in O} |\hat S_I(t) - S(t)| = 0 \ \text{a.s. and (2)} \ \lim_{n \to \infty} \sup_{t \in O} |\hat F_I(t) - F(t)| = 0 \ \text{a.s.,} \tag{3.1}$$

for any $F$ and $G$. For a better presentation, we first prove (3.1) when $G$ is discrete. The proof is typical of the technique used in this paper and is easy to follow. The proof of the main result is similar to this proof, with modifications to deal with the complexity arising from relaxing the assumption on $G$.

Theorem 3.1. Suppose that $G(l, r)$ is a discrete distribution function. Then (3.1) holds.

Proof: Note that (1) and (2) in (3.1) are equivalent. There are two summations in the expression of $\hat F_I(t)$ (see (2.1)). We will derive their almost sure limits and show that their sum is $F(t)$. It is easy to show that the first summation satisfies

$$\lim_{n \to \infty} \sum_{i=1}^{n} \frac{1[R_i^* \le t]}{n} = P\{X \le t\} - P\{L < X \le t < R\} \ \text{a.s. uniformly for } t \ge 0, \tag{3.2}$$

since $\{R^* \le t\} = \{X \le t\} \setminus \{L < X \le t < R\}$ (see (1.1)). Denote by $Q_n(t)$ the second summation in expression (2.1), i.e.,


n  (3-3) 

.7=1  l<fc<i  It  \  3J  i=l 


j=i  i<fe<i 
It follows from (3.2) and (3.3) that

$$\lim_{n \to \infty} \sup_{t \in O} |\hat F_I(t) - F(t)| = \lim_{n \to \infty} \sup_{t \in O} |Q_n(t) - P\{L < X \le t < R\}| \tag{3.4}$$

$$= \lim_{n \to \infty} \sup_{t \in O} \Big| Q_n(t) - \sum_{j=1}^{m} P\{X \in (l_j, t]\}\, P\{(L, R) = (l_j, r_j)\} \Big|, \tag{3.5}$$

where $(l_j, r_j)$, $j = 1, 2, \ldots, m$, are all the possible distinct values of $\mathcal{V}$ which satisfy:

$l_1 < \cdots < l_j < t < r_j < \cdots < r_1$ (note that $l_j$, $r_j$ and $m$ are functions of $t$),

$$P\{X \in (l_i, t]\} > 0, \quad i = 1, \ldots, m \ \ (m \text{ may be } 0 \text{ or } +\infty). \tag{3.6}$$

Hereafter, we will show that expression (3.5) equals 0 a.s. (i.e., (3.1) holds). Note that if $m = 0$, then $P\{L < t < R\} = 0$ and $N_j^t = 0$. Thus $Q_n(t) = 0 = P\{L < X \le t < R\}$. Without loss of generality (WLOG), we can assume that $m \ge 1$ a.s. for all $t$. Since $G$
is  discrete,  ties  may  occur  in  censoring  intervals.  Given  t  and  (Ij^Vj)  which  satisfy  (3.6), 
let  =  #{*;  {L*,R*)  =  ilj,rj)}  (the  number  of  ties  at  {lj,rj))  for  j  =  1,  ...,m.  By  an 
induction  argument  on  m,  it  can  be  shown  (see  Lemma  6.1  in  the  Appendix)  that 


m 


Q«(t) 


E{[ 


II  n  ( 

l<k<j 


Nk. 


+  qt 


N], 


1}’ 


(3.7) 


where 

JVt  =#{all  GO’S  in  (=  #{A:;  Xk  G  (!»,*]}  -  #{fe;  Xk  G  ili,t],Ll  <  t  <  Rl)}), 

=  l[iV^  >  0],  (3.8) 

JVi*  =#{all  GO’S  in  {li,ri)}  (=  #{fe;  Afe  G  (lun)}  -  #{fc;  G  iLk,Rk)  ^  ik,ri)}). 

To derive the limit of $Q_n(t)$, we need to derive the limits of the three factors in (3.7). The three limits will be given in (3.9), (3.12) and (3.13) below. Then we derive the limit of $Q_n(t)$ in (3.15). The derivation is as follows.

Since $X$ and $\{L, R\}$ are independent, we have $P\{(L^*, R^*) = (l, r)\} = P\{X \in (l, r)\} \cdot P\{(L, R) = (l, r)\}$, and uniformly for all possible $l_j < r_j$,


lim  ^  =  Urn  *I»'  =  P{js:  e  a.s.. 

n->-oo  n  n^oo  U 


(3.9) 

To derive the limits of the other factors in (3.7), note that the first equality in (3.8)
yields

\[
\lim_{n\to\infty} N^t_{j*}/n \ \ge\ P\{X \in (l_j, t]\} - P\{X \in (l_j, t]\}\,P\{L < t < R\} \quad \text{a.s.} \tag{3.10}
\]

for all $j$ and uniformly for all $t \in O$. Since $l_j < t < r_j$ by notation in (3.6) and $t \in O$, we
have $P\{L < t < R\} < 1$. It follows from (3.6), (3.10) and the last inequality that

\[
\lim_{n\to\infty} N^t_{j*}/n \ \ge\ P\{X \in (l_j, t]\}\,[1 - P\{L < t < R\}] > 0 \quad \text{a.s.}
\]

for all $j$ and for all $t \in O$. Consequently, $\lim_{n\to\infty} \beta_{j*}(t) = 1$ a.s. WLOG, we can assume
that $\beta_{j*}(t) = 1$. Then the limit of the last factor in (3.7) is given by

/  limn-,.ooAr>/n 
(=  r: - J  ,  ) 

Ti’-^oo  Njiii 
i


P{X  e  fe.t]}  -  P{X  g 

P{Xe(li,rj)}-P(X€(/,;rj),(£,Jl)D(lj,rj)} 

P{X  e  (Ij,  i]}  -  ELi  p{^  e  fe.i].  (^.  fl)  =  (<<■  ’-<)}  -  Er>.i  P{-y  e  ft.  <1.  =  ('<■  >•<)} 

P{Jf  €  (/j.r,)}|l  -  EL,P{(i.fi)  =  (‘i.ri)}] 

P{xe(l,;t]}  Er>,P<X€(U])P{(L,Jl)^(l>,n)}  ^ 

P{X  €  fe.r^)}  P{x  €  ((j.r,)}[l  -  EUP{(i.-K)  =  ft.'-f)}] 


uniformly  for  all  t  E  O,  j  >  1. 

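The limit of the product in the summand of (3.7) is given by

\[
\lim_{n\to\infty} \prod_{1\le k<j} \frac{N_{k*} + q_k}{N_{k*}}
= \prod_{1\le k<j} \frac{P\{X \in (l_k, r_k)\}\,\big[1 - \sum_{h<k} P\{(L,R) = (l_h, r_h)\}\big]}{P\{X \in (l_k, r_k)\}\,\big[1 - \sum_{h\le k} P\{(L,R) = (l_h, r_h)\}\big]}
= \frac{1}{1 - \sum_{h<j} P\{(L,R) = (l_h, r_h)\}} \quad \text{a.s.} \tag{3.13}
\]

uniformly for all $t \in O$, $j \ge 1$. (3.7), (3.9), (3.12) and (3.13) yield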


m  2 

^Um  Qn(«)  =  E  ^  (h>rj)}P{(L,.R)  =  (h>rj)}]  •  P{(L,R)  =  (I,,,  r^)} 

J=X 

r  P{X  e  ((,,(]}  ES-i  P{^  ^  (ii,  *l}P{(i.  Ji)  =  ft.ri)}  1  ^ 

■  [p{X  e  P{X  €  (ii,r,)}[l-EiiP{(EJJ)  =  fen)})" 

.^fP{xe(;i,t]}P((L,fi)  =  (tj,n)} 
i-Eft<iP{feJ«)  = 

p{(Z,Ji)  =  (ij,n)}Er>.,P{^  e  (ii,t]}P{(£,ii)  =  (ii,n)}  ^ 

[1  -  Ek,  P{(i,  Jt)  =  -  Eii  P{(i.  fi)  =  ft.  n)}]  J 

uniformly for all $t \in O$. In view of (3.5), to prove the theorem it suffices to show that the
last expression of (3.14) equals $\sum_{i=1}^{m} P\{X \in (l_i, t]\}\,P\{(L,R) = (l_i, r_i)\}$. That is


_  r  P{X  E  (lj,t]}P{(L,Ji)  — 

P{(L,R)  =  (lj,rj)}Er>jP{^  g  (f,,f]}P{(L,i^)  =  (h,n)}  ^ 

[1  -  E^<i  P{(i,«)  =  ('».  ’•<.)}1[1  -  EL  P{(i. «)  =  ('i.n)})  J 

m 

-^P{X€  (ii,t]}P{(I,,iS)  =  ft,n)} 

i=l 


k=l 


P{(L,Ji)  =  (ij.rj)}P{X  €  (k,t]}P{(L,Jt)  =  (it.rt)} 


il;,  41  -  E.<,  P{(i.l5)  =  ('/.,r»)}l[l  -  ELi  P{(i.H)  =  ft.n)}) 


]}. 


(3.15) 

(3.16) 
which is proved in Lemma 6.2 (in the Appendix) by showing that each summand in
expression (3.16) equals 0. This completes the proof of the theorem. □
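The telescoping step behind (3.13) can be checked with exact arithmetic. The sketch below is only an illustration, not part of the proof: it writes $p_h$ for $P\{(L,R) = (l_h, r_h)\}$, uses hypothetical values for the $p_h$, and the function names `ratio_product` and `closed_form` are introduced here for convenience.

```python
from fractions import Fraction


def ratio_product(p, j):
    """Product over 1 <= k < j of (1 - sum_{h<k} p_h) / (1 - sum_{h<=k} p_h)."""
    result = Fraction(1)
    for k in range(1, j):
        mass_before_k = sum(p[: k - 1], Fraction(0))   # p_1 + ... + p_{k-1}
        mass_through_k = sum(p[:k], Fraction(0))       # p_1 + ... + p_k
        result *= (1 - mass_before_k) / (1 - mass_through_k)
    return result


def closed_form(p, j):
    """Telescoped value 1 / (1 - sum_{h<j} p_h)."""
    return 1 / (1 - sum(p[: j - 1], Fraction(0)))


# Hypothetical censoring probabilities p_h, with total mass below 1.
p = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10), Fraction(1, 4)]
checks = [ratio_product(p, j) == closed_form(p, j) for j in range(1, len(p) + 1)]
```

Each numerator cancels the preceding denominator, so only the last denominator survives; this is why the common factor $P\{X \in (l_k, r_k)\}$ plays no role in the limit.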

4. Main results. In this section, we will prove (3.1) assuming F and G are arbitrary.
We will also investigate the consistency of $S_I(t)$ (or $F_I(t)$) on the boundary of O. We first
establish Lemma 4.1, which reduces uniform almost sure convergence to pointwise
almost sure convergence.

Lemma 4.1. Suppose that $\{F_n\}_{n\ge 1}$ is a sequence of monotone functions on an
interval $[a, b)$ and $F(t)$ is a bounded, monotone and right-continuous function on the interval
$[a, b)$. If

\[
\lim_{n\to\infty} F_n(t) = F(t) \ \ \forall\, t \in [a, b) \quad \text{and} \quad \lim_{n\to\infty} F_n(t-) = F(t-) \ \ \forall\, t \in (a, b],
\]

where $F(t-) = \lim_{s \uparrow t} F(s)$, then $\lim_{n\to\infty} \sup_{t\in[a,b)} |F_n(t) - F(t)| = 0$.

If  F  is  continuous  then  the  lemma  is  a  well  known  result.  The  proof  of  the  lemma  is 
very  similar  to  the  proof  for  a  continuous  F  and  is  put  in  the  Appendix. 
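The left-limit hypothesis of Lemma 4.1 matters when F has a jump. The following sketch is an illustration only and is not taken from the paper: it uses the hypothetical step functions $F(t) = 1[t \ge 1/2]$ and $F_n(t) = 1[t \ge 1/2 - 1/n]$ on $[0, 1)$. Here $F_n(t) \to F(t)$ at every fixed $t$, but $F_n(1/2-) = 1$ for all $n$ while $F(1/2-) = 0$, and the supremum distance stays equal to 1.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def F(t):
    """Limit function: a single unit jump at t = 1/2 (right continuous)."""
    return 1 if t >= HALF else 0


def F_n(n, t):
    """Hypothetical approximating sequence: the jump sits slightly to the left of 1/2."""
    return 1 if t >= HALF - Fraction(1, n) else 0


def pointwise_gap(n, t):
    """|F_n(t) - F(t)| at a fixed point t."""
    return abs(F_n(n, t) - F(t))


def gap_at_witness(n):
    """|F_n - F| at t = 1/2 - 1/(2n), a point lying between the two jump locations."""
    t = HALF - Fraction(1, 2 * n)
    return pointwise_gap(n, t)
```

For any fixed $t$ the gap is eventually 0, yet `gap_at_witness(n)` equals 1 for every $n$, so convergence is not uniform; the second condition of the lemma is exactly what rules this out.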

It is easy to verify that $S_I(t)$ is a monotone function of $t$. In view of Lemma 4.1, to
prove (3.1), it suffices to show that

\[
\text{(S1)}\ \lim_{n\to\infty} S_I(t) = S(t) \quad \text{and} \quad \text{(S2)}\ \lim_{n\to\infty} S_I(t-) = S(t-), \tag{4.1}
\]

for each $t \in O$, (S1) holds for $t = \tau_r$ and (S2) holds for $t = \tau_l$. To avoid the complexity on
the boundary of O, we first treat the case $t \in O$ in Theorem 4.1. On the boundary of
O, depending on whether F or G is continuous on the boundary, the proofs require some
modification, so we give the proof for each case separately (in Lemmas 4.3, 4.4 and 4.5).
However, the main idea is similar to the proof of Theorem 3.1.

Theorem 4.1. For any arbitrary F and G, and for any $t \in O$, (4.1) holds.

Proof: We first show equality (S1) by mimicking the proof of Theorem 3.1. Note that
in the latter proof, the arguments up to (3.4) hold for any F and G. Hence, (3.4) implies
that it suffices to show that

\[
\lim_{n\to\infty} Q_n(t) = P\{L < X \le t < R\},
\]

or equivalently, to show that for any large positive integer $m$, we have

\[
\limsup_{n\to\infty}\, \{Q_n(t) - P\{L < X \le t < R\}\} \le O(1/\sqrt{m}) \quad \text{a.s.,} \tag{4.2}
\]
\[
\limsup_{n\to\infty}\, \{-Q_n(t) + P\{L < X \le t < R\}\} \le O(1/\sqrt{m}) \quad \text{a.s.} \tag{4.3}
\]

WLOG, we can assume that $(-\infty, 0) \in V$. Denote $B = \cup_{(l,r)\in V}\,(l, r)$ and $B^c$ its
complement set. If $t \in B^c$, then $P\{L < t < R\} = 0$ and thus $Q_n(t) = 0$ since $N_k^t = 0$
for all $k$ (see (2.2) and (3.3)). It follows that $P\{L < X \le t < R\} = 0 = Q_n(t)$, which
implies (4.2) and (4.3). Thus we assume $t \in B$. To show (4.2) and (4.3), we will mimic the


arguments from (3.5) through (3.14) in the proof of Theorem 3.1. However, there may be
uncountably  many  elements  in  V  such  that  (3.6)  holds.  Thus  we  modify  (3.6)  as  follows: 

Given $t \in B$ and given a large positive integer $m$, we can find a finite partition, $C_m$,
of the set $\{(l, r) \in V;\ l < t < r\}$ such that $C_m$ satisfies:


(1)  Cm  =  {Df,  «  =  1, ...,  mt},  where  mt  (<  m), 

i-ai,li],re  [ri,ri  +  bi) 

(or  r  =  ri) 


D. 


ifai,6i>0  1 

if  tti  =  0,  (or  bi  =  0)  i  ’ 


(2)  /l  —  fll  ^  ^1  ^  ^  ^mt  ~  —  ^mt  ^  ^  ^  '^mt  —  ‘^mt  “t"  ^mt 

(3) $l_{m_t} = \sup\{l;\ (l, r) \in V,\ l < t < r,\ P\{X \in (l, t]\} > 0\}$,

(4) $r_{m_t} = \inf\{r;\ (l, r) \in V,\ l < t < r,\ P\{X \in (l, t]\} > 0\}$,  (4.4)

(5) $G(l_i-, +\infty) - G(l_i - a_i, +\infty) < 4/m$, $F(l_i-) - F(l_i - a_i) < 4/m$,

(6) $G(+\infty, r_i + b_i-) - G(+\infty, r_i) < 4/m$, $F(r_i + b_i-) - F(r_i) < 4/m$,

(7) $P\{(L, R) \in \cup_{i=1}^{m_t} D_i\} = P\{L < t < R\}$,

where $l_0 = -\infty$ and $r_0 = \infty$. WLOG, we can assume that $a_i = l_i - l_{i-1}$ and $b_i = r_{i-1} - r_i$,
$i = 1, \ldots, m_t$. Then $D_i$, (5) and (6) in (4.4) reduce to

$D_i = \{Y;\ L \in (l_{i-1}, l_i],\ R \in [r_i, r_{i-1})\}$,

(5') $G(l_i-, +\infty) - G(l_{i-1}, +\infty) < 4/m$, $F(l_i-) - F(l_{i-1}) < 4/m$,

(6') $G(+\infty, r_{i-1}-) - G(+\infty, r_i) < 4/m$, $F(r_{i-1}-) - F(r_i) < 4/m$, $1 \le i \le m_t$.


In the proof of (4.2), we need a condition: $t \in A_m$, where
$A_m = \{t \in B;\ P\{X \in [l_{m_t}, r_{m_t}]\} \ge 1/\sqrt{m}\}$.
However, it can be shown (see Lemma 6.3) that

(S3) if (4.2) and (4.3) hold for all $t \in A_m$, then (4.2) and (4.3) hold for $t \in B \setminus A_m$.
Thus, WLOG, we can assume that $t \in A_m$. It follows from (2) in (4.4) that

$P\{X \in [l_i, r_i]\} \ge P\{X \in [l_{m_t}, r_{m_t}]\} \ge 1/\sqrt{m}$ for $i = 1, \ldots, m_t$.  (4.5)

To  establish  an  expression  for  Qn{t)  corresponding  to  (3.7),  we  further  denote 


r  =  {L,R},  Y*  =  {L*,R*},  Vi  =  {li,ri}, 

Di  =  {Y-,  Le  ih-uh]  Sz  Re  (ri,ri_i);  or  L  G  {li-i,li)  k  Re  [ri,ri_i)}, 
iVi*-  #{CO’s  in  [/i,ri]},  =  #{CO’s  in  [/i,t]},  =  l[Nl_  >  0], 

for  aU  possible  i.  Given  t  G  [a,  b],  for  z  =  1, ...,  mt,  let 

Qi  =  #0;  Y;  e  Di},  qi  =  #{i;  Y;  e  D^}  (qi  =qi-  q^), 

Qt  =  #0’5  =  -^1  =  or  ri},  Ai  =  #{CO’s  in  (/*_!,  li)  or  in  (n,/-*-!)}. 

Using  the  same  idea  as  in  proving  (3.7),  it  can  be  shown  that 


i:{f  n(^ 


Xi*- +  qi  +Ai,^Pi_^{t)  -TT  /iVi*  +  gj  +  Aisi3i^{t) 

11  ^  Ni,  +  Ai  > 


h=\ 


A 

n 


l<i<h 

Ni, 


+  Ai 


if- 


K- 


l<i<h 


n  ( 

l<i<h 


+  Qi  +  Ai-.i3i-» 


iVj*_  +  Ai 


(t)  jQ-  +  Ai^/3j,(t)  Nl 

“1“  Ai 


l<i<h 


+  Ah 

}  <  Qnit)(A.6) 


h* 
t


and 


h=i  ^  ^ 


l<i<h 

Qh 


Ni,- 


^iVi*  +  Qj  +  Ah 

l<i<h 


w  +  Aft 


+-  n  ( 


l<i<h 


Ni 


i*- 


' \ /3t.  (*) 


^i*- +  Qi  \Pi-4t)  TT  f^i*+<li\l3i»{t)  ^'^h*  \  rA'7\ 

^  li  ^  Ni,  ^  (iVft*)^-wr 


l<i<h 


The  proof  is  relegated  to  the  Appendix  (see  Lemma  6.4).  We  will  use  (4.7)  to  prove  (4.2) 
and  use  (4.6)  to  prove  (4.3). 

We first show (4.2). By reasoning similar to that before equation (3.12), WLOG,
we can assume $\beta_{h*}(t) = \beta_{h*-}(t) = 1$ and thus we omit the exponents $\beta_{h*}(t)$ etc. in the
expressions of (4.7). Note that if G is discrete and only takes values at $(l_i, r_i)$, $i = 1, \ldots, m_t$,
all  the  ratio  indexed  with  vanish  and  the  expression  on  the  right  hand  side  of  (4.7) 
reduces  to  (3.7).  On  the  other  hand,  if  G  is  continuous,  then  5°  =  0  and  Qh  =  Qh  w.p.l, 
and  (4.7)  reduces  to 


mt 

Qn{t)  <  ^{(“)(  n 

h=l 


Kidh 


Qi\  \  1 

iVi*_  ^^i\rft*_  + Aft^J 


(4.8) 


If  G  is  neither  continuous  nor  discrete,  the  proof  is  similar  to  that  for  the  continuous  case. 
For  ease  in  understanding,  we  will  assume  that  the  joint  distribution  function  G{1,  r)  is 
continuous.  Thus  we  will  prove  (4.8),  instead  of  (4.7). 

We now derive the limits for the three factors in the summation of (4.8). The following
argument is parallel to (3.9) through (3.16) in the proof of Theorem 3.1. The limit of $Q_h/n$
is given by $\lim_{n\to\infty} Q_h/n = P\{Y^* \in D_h\}$ a.s. (see (3.9)) and

\[
P\{Y^* \in D_h\} = P\{X \in [l_h, r_h]\}\,P\{Y \in D_h\} + P\{X \in (L, l_h) \cup (r_h, R),\ Y \in D_h\}
= P\{X \in [l_h, r_h]\}\,P\{Y \in D_h\} + O(1/m^2). \tag{4.9}
\]

Thus the limit of the first factor in (4.8) is

\[
\lim_{n\to\infty} \frac{Q_h}{n} = P\{X \in [l_h, r_h]\}\,P\{Y \in D_h\} + O(1/m^2) \quad \text{a.s.}
\]

Since G is continuous, $P\{(L, R) \supset [l, r]\} = P\{(L, R) \supset (l, r)\}$. Then

,.  5Vj*_  ~\'  Qi  ^  1.  /i  ,  Qi/'^ 

hm  — -  (=  hm  (1  +  — - j-)) 

n-^oo  Niif—  n-^oo 


--1  + 
=1  + 


P{Y*  G  Pi} 

P{CO’s  in  [li,ri]} 

P{X  G  [li,ri]}P{Y  G  Di}  +  0(l/m2) 


(see  (4.9)) 
2'! 


P{X  G  [li,n]}  -  P{X  G  {L,R)  D  [h,ri]} 

P{X  G  [/i,  r,]}[l  -  E,<i  ny  ^Dj}  +  P{y  e  A}]  +  0(l/m^) 


ll-Ei<iP{y€Di}] 


a.s.  (due  to  (4.5)). 


(4.10) 


(4.11) 
As a consequence, the second factor in (4.8) satisfies:


K  n 


n—¥oo 


^  ni 

l<i<h 


^j<i  ‘ 


l<i<h 


< 


1  + 

DA 

2<i<h-l 


<■ 


■  1  -  E,<ft-l  P{P  ^DiV  1  -  Ei<m.  P^y  ^ 

Note  that  the  summation  in  the  denominator  of  the  last  expression 


mt 


(4.12) 


^  P{y  e  £>,}  =  p{(t.fl)  3  (im.-i.r-m.-i)}  <  P{L  <  (  <  11}  <  1  (see  (2.2)). 

j<mt 

since $l_{m_t-1} < t < r_{m_t-1}$ and $t \in O$, which is an open set. Denote $d_0 = 1 - P\{L < t < R\}$,

which is independent of the integer $m$ and is $> 0$. It follows from (4.12) that the second
factor in (4.8) satisfies:


V —  TT  _ t _ 


n—^oo 


l<i<h 


< 


l-Ei<s-iP{5'€B,} 

Note  that  by  the  assumption  on  Cm  we  have 


dp  +  Oim 
^  do  ■' 

[l  4-  0(m“^/^/do)]  a.s..  (4.13) 


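\[
\lim_{n\to\infty} \frac{A_h}{n} = P\{X \in (l_{h-1}, l_h) \cup (r_h, r_{h-1})\} - P\{X \in (l_{h-1}, l_h) \cup (r_h, r_{h-1}),\ (L^*, R^*) \supset (l_h, r_h)\}
\]
\[
\le P\{X \in (l_{h-1}, l_h) \cup (r_h, r_{h-1})\}\,[1 - P\{(L, R) \supset (l_{h-1}, r_{h-1})\}]
\]
\[
\le (8/m)\,d_0 \quad \text{(by (5') and (6') in (4.4) and by (4.12)).}
\]

Hence, the last ratio in (4.8) satisfies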

..  K.-  +  ,.  K.-  +  Op/m)) 

lim  - T—  <  hm  - — - 

n-^oo  Nh*-  +  Aft  n-foo  Aft*_ 

[P{X  e  [lh,t]}  -  P{X  €  [/ft,t],L*  <  t  <  R*}]  +  Q(l/m) 

P{X  G  [/ft,  rft]}  -  P{A:  g  [/ft,  rft],  (L,  R)  D  [/ft,  rft]} 

_P{X  G  [/ft,/]}  -  Zti nx  €  Kt],  {l,r)  g  Pi} 

P{Xe[lh,rh]}[l-ZtiHYeDi}] 
(due to (4.5))
-j:T^,HX€(L,t],YeDi}  1

PiXslfk.rJXl-ELiPiyeA}]  *VSi 
P{X  £  [it,i]}  -  sL.  P(^  e  M}ny  e  Dj} 

P{Xeli»,rJ}[l-Ef,,P{y€  A}] 

-  ESV nx  e  [4, t]}P{y  e  A}  +  0(l/m)[l  -  E«.i  P{y  e  A}]  ^  1  , 

nx  e  [ii,  rj}[i  -  Eti  P{P  e  A}] 

P{Xe[i,.t)}  Er>>P{^£fe.«l}P{yeA}  1  ,  . 

P{X  €  Ik,  rj>  P{X  €  [ift,  rj}[l  -  E?=i  P{i"  e  ft}] 

(4.10),  (4.13)  and  (4.14)  give  the  limits  of  the  three  factors  in  (4.8),  thus 

lim  Qn{t)  <^|[P{X  G  [/h,rft]}P{F  e  Dh}  +  0{l/m  )]  ^  _  ^ 

.  I ^  t^*’*]} _ 6  [ii,t]}p{y  e  Pi] —  ^  c)(i/(v/S)]) 

^P{xe[!^,rkl}  P{;f  e[(,.,rd}[i-EtiP{Peft}l  ’ 

■^fP(xe[ift,t]}P{r6ft}  P{P6gt}Er>‘>P{^sfe,ti}P{y€Pi}  ^ 

i-Ei<(.P{i'^ft}  [i-Ej<»P{P€i5,}|li-ELiP{Peft}]^ 

+  0(m”^/^)  a.s..  (4-15) 

Moreover

\[
P\{L < X \le t < R\} = \sum_{h=1}^{m_t} P\{L < X \le t,\ Y \in D_h\}
= \sum_{h=1}^{m_t} \big[P\{l_h \le X \le t\} + O(1/m)\big]\,P\{Y \in D_h\}
= \sum_{h=1}^{m_t} P\{X \in [l_h, t]\}\,P\{Y \in D_h\} + O(1/m). \tag{4.16}
\]

Thus  it  follows  from  (4.15)  and  (4.16)  that 
^  {Qn{t)  -P{L<X  <t<R}} 

n—^oo 

^  fP{X€[it,i]}p{rei)ft}  p{ygOh}Eg.VP{^gft.t]}P{^6ft}  ] 

mt 

-  P{A:  €  [it,  J]}P{y  €  £>a}  +  0(m-‘/")  a.s..  (4.17) 

h=l 

Notice that expression (4.17) is identical to expression (3.15) except that $m_t$ is replaced
by $m$, $Y \in D_h$ by $(L, R) = (l_h, r_h)$, $[l_h, t]$ by $(l_h, t]$ and $O(m^{-1/2})$ by 0. That is,

\[
\limsup_{n\to\infty}\, \{Q_n(t) - P\{L < X \le t < R\}\} \le \sum_{h=1}^{m_t} s_h + O(m^{-1/2}) \quad \text{a.s.,}
\]

where $s_h$ is the same as the summand in (3.16), except that $(L, R) = (l_h, r_h)$ is replaced
by $Y \in D_h$ and $(l_h, t]$ is replaced by $[l_h, t]$. The same argument as in the proof of Lemma
6.2 yields $\sum_{h=1}^{m_t} s_h = 0$. (4.2) then follows.

With  the  same  idea,  we  can  show  that  (4.3)  holds.  This  completes  the  proof  of  (SI). 
In  a  similar  manner,  we  can  show  (S2).  This  completes  the  proof.  □ 

If $\tau_l = +\infty$, Theorem 4.1 yields (3.1). If $\tau_l < \infty$, in order to show (3.1), we need to
further verify (S2) of (4.1) for $t = \tau_l$ and (S1) of (4.1) for $t = \tau_r$. It can be verified that
the proof of Theorem 4.1 can be applied to all $t \in O$, provided that for each $t \in O$,

\[
P\{(L, R) \supset [l_{m_t}, r_{m_t}]\}\ (= 1 - d_0) < 1. \tag{4.18}
\]

(4.18) is needed in (4.13) etc. However, (4.18) may not hold on the boundary of O.

To  show  (4.1)  on  the  boundary  of  O,  we  first  establish  three  lemmas. 

Lemma  4.2.  For  any  arbitrary  F  and  G, 

liin  Si{ti)  <  lim  Si{ti-)  <  S'(rj-)  a.s.  and  lim  5j(r^)  =  hm  Si{Tr+)  >  S(Tr)  a.s.. 

n—^oo  n— ^oo  n—^oo  n-^oo 


It  is  worth  noting  that  due  to  our  convention  Si{t)  is  right  continuous,  thus  Si{Tr)  = 
Si(Tr+)  in  the  lemma.  The  proof  of  the  first  inequality  is  identical  to  Yu  and  Li  (1994, 
Lemma  2).  The  proof  for  the  second  inequality  is  similar  to  that  for  the  first  one. 
Lemma 4.3. For any arbitrary F and G,

(1) if $F(\tau_l-) = 1$, then (S2) of (4.1) holds for $t = \tau_l$;

(2) if $F(\tau_r) = 0$, then (S1) of (4.1) holds for $t = \tau_r$.

The proof of the first statement is the same as Yu and Li (1994, Lemma 4). The proof
of the second statement is similar to that of the first one.

Lemma 4.4. For any F and G,

(1) if $P\{L = \tau_l\} > 0$ then (S2) holds for $t = \tau_l$;

(2) if $P\{R = \tau_r\} > 0$ then (S1) holds for $t = \tau_r$.

Proof: We first prove statement (1). Note that if $t = \tau_l$ and $P\{L = \tau_l\} > 0$, then
$d_0 > 0$ (see (4.18)) and $P\{(L, R) \supset (l_i, r_i)\} \le 1 - d_0 < 1$ for $t = \tau_l$. It can be checked that
in the proof of Theorem 4.1 all the statements from (4.6) through the end of the proof
hold for $t = \tau_l$. This completes the proof of statement (1). Statement (2) can be proved
similarly. □

Lemma 4.5. For any arbitrary F and G, (S2) in (4.1) holds for $t = \tau_l$ and (S1) in
(4.1) holds for $t = \tau_r$.

Note that Lemmas 4.3 and 4.4 are special cases of Lemma 4.5. The proof of
Lemma 4.5 is relegated to the Appendix.

Remark  4.1.  When  G{ti-,  +oo)  =  1,  if  t  then  we  would  never  observe  any  exact 
observation  at  r;  (w.p.l).  Thus  due  to  the  convention  in  section  2,  we  have  Si{ti-)  = 
Si(ti)  =  Si(L^^^).  In  particular,  if  G{t,  +oo)  is  continuous  at  t  but  F{t)  is  not,  then 

lim  Si{ti)  =  lim  Si{ti-)  =  S{ti-)  >  S{ti)  a.s.. 

n-^oo  n— »'Oo 
However, due to the convention on $S_I(t)$, $S_I(t)$ is right continuous at $\tau_r$ and so is $S(t)$.
Thus, $S_I(\tau_r)$ always converges to $S(\tau_r)$ a.s.

In  view  of  Remark  4.1,  it  is  easy  to  derive  the  following  result: 

Lemma  4.6.  If  F{t)  is  continuous  at  ti,  then  limji.^00  sup^g^ \Si(t)  -  5^(t)|  =  0  a.s.. 
It  follows  from  Lemmas  4.1,  4.3,  4.4,  4.5  and  4.6  that  the  following  result  holds. 
Theorem  4.2.  Let  O*  =  OU  {r^}.  Under  the  DI  Model,  for  any  arbitrary  F  and  G, 

lim  sup  \Si{t)  —  5(t)|  =  0  a.s.; 

lim  sup  \Si(t)  -  S{t)  \  —  0  a.s.  unless  F{ti-)  <  F(ti)  and  G{ri-,  +00)  =  1. 


5.  Discussion.  A  direct  consequence  of  Theorem  4.2  is  the  following  result  related 
to  the  PLE  with  left-censored  data. 

Corollary  1.  For  any  arbitrary  F  and  G,  the  PLE  SpL{t)  with  left-censored  data 
satisfies  lim,i_^oo  supi>^^  \SpL{t)  -  5(i)|  =  0  a.s.. 

It is interesting to see that the PLE with left-censored data does not carry over the
shortcoming of the PLE with right-censored data at $\tau_l$. As implied by Theorem 4.2,

lim  sup  \SpL{t)  —  S{t)\  =  0  a.s. 


fails for arbitrary F and G with right-censored data. The shortcoming does not depend
on the definition of $S_I(t)$ for $t > L^*_{(n)}$.

With  right-censored  data,  it  has  been  proved  that 

For  any  arbitrary  F  and  G,  lim  sup  |5c(t)  —  <S'(t)|  =  0  a.s. 

—  (n) 


(see  Yu  and  Li  (1994)).  The  corresponding  statements  with  interval-censored  data  are 

lim  sup  \Sc{t)  —  'S'(t)|  =0  a.s.,  (5.1) 

mini  or  t<L*  . 

»  —  —  (n) 

which  follow  from  Theorem  4.2. 

Finally,  it  is  desirable  to  derive  the  almost  sure  limit  of  Sjit)  over  the  entire  region 

{0  if  r;  =  0  and  <  00 

if  0  <  Ti  and  Tr  <  -hoo  It  is  easy  to  derive  the  following 
-foo  if  0  <  T;  and  =  -f 00. 

result. 

Remark  5.1.  lim„_>,oo supjy<^<.r^  \Fcit)  —  'S^('rr)|  =  0  a-S-  and 

I  limn-^00  sup.^,<^<J^^  \Scit)  -  S{ti-)\  =  0  if  S{ti-)  >  S{ti)  and  G{ti-,  +00)  =  1; 

\  lim„^oo  sup^,<i<M  \Sc{t)  -  S{ti) I  =  0  otherwise 
It is well known that the GMLE is not uniquely defined in empty intervals. However,
the uniform strong consistency of the GMLE does depend on the definition over empty
intervals (see Yu and Li (1994)).

The natural extension of $S_I(t)$ from expression (2.1) assigns weight only to $R_i^*$. However,
this convention will affect both (3.1) and Theorem 4.2. Finally, we point out that
(5.1) does not depend on the definition of $S_I(t)$ over empty intervals.
Appendix

We  give  the  proofs  of  lemmas  in  this  section.  This  section  could  be  deleted  and  put 
in  a  technical  report  in  a  future  revision. 

Proof of Lemma 4.1: Since F is bounded, monotone and right continuous on $[a, b)$,
for any $\epsilon > 0$, there exist finitely many points $t_1, \ldots, t_k \in [a, b)$ such that $a = t_1 < t_2 <
\cdots < t_k$ and $|F(t_{i+1}-) - F(t_i)| < \epsilon$ for $i = 1, \ldots, k$, where $t_{k+1} = b$. Since $F_n(t)$ converges to
$F(t)$ for all $t = t_i$, $i = 1, \ldots, k$, and for all $t = t_i-$, $i = 2, \ldots, k + 1$, there exists an $n_0$ such
that whenever $n \ge n_0$ we have $|F_n(t) - F(t)| < \epsilon$ for all $t = t_i$, $i = 1, \ldots, k$, and for all
$t = t_i-$, $i = 2, \ldots, k + 1$. For any $t \in [a, b)$, there exists some $i$ such that $t \in [t_i, t_{i+1})$, and then

\[
|F_n(t) - F(t)| \le \max\{|F_n(t_i) - F(t_{i+1}-)|,\ |F_n(t_{i+1}-) - F(t_i)|\}
\]
\[
\le \max\{|F_n(t_i) - F(t_i) + F(t_i) - F(t_{i+1}-)|,\ |F_n(t_{i+1}-) - F(t_{i+1}-) + F(t_{i+1}-) - F(t_i)|\}
< 2\epsilon,
\]

whenever $n \ge n_0$. Since $t$ and $\epsilon$ are arbitrary, the lemma follows. □

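Proof of Lemma 4.5: WLOG, we can assume that $0 < \tau_l < \tau_r < +\infty$. It follows
from Lemmas 4.1, 4.3 and 4.4 that it suffices to show that

\[
\lim_{n\to\infty} |S_I(\tau_l-) - S(\tau_l-)| = 0 \ \text{a.s., if } F(\tau_l-) < 1 \text{ and } G(\tau_l-, +\infty) = 1; \tag{6.2}
\]
\[
\lim_{n\to\infty} |S_I(\tau_r+) - S(\tau_r)| = 0 \ \text{a.s., if } F(\tau_r) > 0 \text{ and } G(+\infty, \tau_r) = 0. \tag{6.3}
\]

We first show (6.2). Notice that $\limsup_{n\to\infty} S_I(\tau_l-) \le S(\tau_l-)$ a.s. by Lemma 4.2. Then (2.1),
(3.2) and (3.3) yield

\[
\liminf_{n\to\infty} Q_n(\tau_l-) \ge P\{L < X < \tau_l < R\}. \tag{6.4}
\]

Thus it suffices to show that

\[
\limsup_{n\to\infty} Q_n(\tau_l-) \le P\{L < X < \tau_l < R\} \ \text{ if } F(\tau_l-) < 1 \text{ and } G(\tau_l-, +\infty) = 1. \tag{6.5}
\]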

Hereafter,  we  assume  F{ti—)  <  1,  G(tj— ,+oo)  =  1  and  t  =  ti—  (see  (6.2)),  then  it 
yields  Imt  —  ti  (see  (4.4))  and  Thus  do  =  1  -  F{{L,  R)  D  {Imt.fmt)}  =  0  and  the 

proof  of  Theorem  4.1  is  not  applicable  directly,  since  it  needs  do  >  0  (see  (4.18)  or  (4.13)). 
To  mimic  the  proof  of  Theorem  3.1  or  4.1,  we  modify  (4.7)  as  follows: 

Qn(t) 


2  E  n  n  ( 


Ni^.-+qi  >.  TT  (Xi^f  +  q^^  +  Afc 


h=\  \<i<h 


l<i<h 


n  ( 


iVi*-  +  qi 


l<i<h 


\  TT  I 


+—  n  ( 

l<z<mt 

(+%  n  ( 


+  qi 


)  n 


+  gj  N  TT  -^mt* 

M.  '  11  ^  M.  hhf  \Pm, 


l<i<7nt 


Ni.  ^ 


(which  equals  0)). 
(6.6) can be proved in a similar manner as in deriving (4.7). The only difference between

(4.7)  and  (6.6)  is  in  the  third  summand  (indexed  by  m*).  Note  that  when  t  =  n-,  =  0 

(see  (3.8))  and  thus  the  fourth  expression  in  (6.6)  equals  0. 

WLOG, we can assume that $S(\tau_l-) - S(\tau_r) = a_0 > 0$. Otherwise, since $\tau_l < \tau_r$ and
in view of Lemma 4.2, we have

$S_I(\tau_l-) \le S(\tau_l-) = S(\tau_r) \le S_I(\tau_r) \le S_I(\tau_l-)$ a.s.,
i.e., (6.2) and (6.3) hold.

Let the notation be the same as in the proof of Theorem 4.1. It can be verified that
the arguments between (4.6) and (4.17) remain true except for the following: $P\{X \in [l_i, r_i]\} \ge
a_0 > 0$ for $t = \tau_l-$, thus the restriction (4.5) is not needed. In order to show (6.5), or
equivalently (4.2), it suffices to show that inequalities (4.13) and (4.14) hold.

To  this  end,  as  argued  in  the  paragraph  after  (4.8),  WLOG,  we  can  assume  that  G  is 
continuous,  and  we  further  assume  PlF  €  A}  =  *  =  1)  (6.6)  reduces  to 


mt—1 


h=l 
Qm 

n 


n  ( 


Nj*-  +  Qi' 

Ni.- 

iV,:*- 


Nl^_  +  Ah 


QM  2  t  n  ( + Aft 

l<i<h 


(6.7) 


f  N*  1 


Since  P{X  G  [li,ri]}  >  oq  >  0  for  all  i  if  /*  <  r;  <  n,  in  a  similar  manner  as  in  deriving 
(4.13),  we  can  show  that  for  t  =  t/— , 


lim 


n 

l<i<h 


“h  % \ 

V  yvr  ) 


l+0(l/m^)  -r-T  ~  ^j<i  m  0(1/^  ) 

- 1  -  €  Dj}  1  -  E,<.-x 


< 


1  - 

m 


a.S.,  h  <  mt- 


(It  is  worth  noting  that  0(l/rn^)  in  the  above  expression  is  due  to  P{X  G  [/*,  Vi]}  >  oq  >  0 
and  0(l/m^/^)  in  (4.13)  is  due  to  P{X  G  [li,ri]}  >  l/(m^/^)}.)  Since 


m— 1 


2=2 


m 


<  jj 

2<i<y/m 


m—1 

n 


#  +  0(l/m^) 


<(l  +  0(l/m))'^- (l +0(l/(m’^^)))’" 
<l+0(l/m*/2), 


(6.9) 
1


lim  n  (%^)  =  T- 


(H-0(m 


This is inequality (4.13) for $t = \tau_l-$.

It  can  be  verified  that  the  argument  in  proving  (4.14)  holds  for  h  =  1, ...,  mt  - 1,  which 
is  the  upper  bound  for  the  limit  of  the  third  factor  in  expression  (6.7).  We  need  to  show  a 
similar  inequality  corresponding  to  (4.14)  holds  for  h  =  mt.  The  third  factor  in  expression 
(6.8)  satisfies 


JVJl 


•  ie(/  ,T,)r6{rr  .)  (/  rl  €  vl 

-  “"P I  P{X  6  [/,  r]}  -  P{X  €  ll,rl  (L%  R>)  D  [i,  r]}’  '  ^  ^ 

P{X  €  [l,r]}-P{X  e  [IML,R)  D  [1,71}  ’  ^  ^  ^  (r.,r^*-i),(/,r)  G  V| 

-°“P{p{X€[l’f}’  '  ^  ('“‘-I' ^  ^ 

<0(1/ {mao))  (due  to  (5')  in  (4.4)) 

^  ^  (since 

P{Xe[r,,T.]}  P{Jfsh,T,l}[l-ESiP{P6®<}l 

This is the inequality corresponding to (4.14) for $h = m_t$. Thus the arguments between
(4.13) and (4.17) hold. As a consequence (6.5) holds. It follows from (6.4), (6.5) and (3.4)
that (6.2) holds.

(6.3) can be shown using the same idea. This completes the proof of the lemma. □

Lemma 6.1. Using the notation as in the proof of Theorem 3.1, (3.7) holds.

Proof. We will prove the lemma by induction on $m$ (see (3.7)).
$m = 1$. By the definition of $m$, there is only one $(l_1, r_1)$ in V such that $t \in (l_1, r_1)$.
Suppose that there are $q_1$ ties, say, $(L_1^*, R_1^*) = \cdots = (L_{q_1}^*, R_{q_1}^*) = (l_1, r_1)$. It follows from
the convention on the ties (see Remark 2.2) that

1. $N_j = N_{1*} + q_1 - j$, $j \le q_1$, since $(L_i^*, R_i^*)$ is a CO of $(L_{i-1}^*, R_{i-1}^*)$ for $i = 2, \ldots, q_1$;

2. $N_1^t = \cdots = N_{q_1}^t = N_{1*}^t$, since $L_i^* < t < R_i^*$, $i \le q_1$;

3. $N_i^t = \beta_i(t) = 0$ for $i = q_1 + 1, \ldots, n$.

If $N_{1*}^t > 0$ (and thus $\beta_i(t) = 1$, $i \le q_1$), then (3.3) yields


3 


i=i 

<11 


'^<^<3 


Nk  ^(Nj)^^(^^) 


:)] 


Nf 


1* 


Nu  +  qi  —  h  ^  Nu  +  qi  ~  j 


=E^[  n(i+ 

j—l  l<k<j 
0 

=~fT;^]  (which  can  be  proved  by  induction  on  qi) 
n  Ai* 

=1^11  n 

Th 


(6.11) 


l<fc<l 


Nk. 
That is, (3.7) holds if $N_{1*}^t > 0$. If $N_{1*}^t = 0$, (3.7) is trivially true. Thus (3.7) holds for

$m = 1$.
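The step "which can be proved by induction on $q_1$" in the case $m = 1$ rests on a finite-sum identity. Assuming the sum has the form $\sum_{j=1}^{q} \big[\prod_{1\le k<j} (1 + \tfrac{1}{N + q - k})\big] \tfrac{1}{N + q - j} = \tfrac{q}{N}$, with $N$ standing for $N_{1*}$ and $q$ for $q_1$ (this reading is an assumption based on items 1 to 3 above), the identity can be checked exactly. The function name `tie_sum` is hypothetical.

```python
from fractions import Fraction


def tie_sum(N, q):
    """Sum over j = 1..q of prod_{1<=k<j} (1 + 1/(N+q-k)) times 1/(N+q-j)."""
    total = Fraction(0)
    for j in range(1, q + 1):
        prod = Fraction(1)
        for k in range(1, j):
            prod *= 1 + Fraction(1, N + q - k)
        total += prod * Fraction(1, N + q - j)
    return total


# Compare with the claimed closed form q / N on a small grid (N >= 1 avoids division by zero).
identity_holds = all(
    tie_sum(N, q) == Fraction(q, N)
    for N in range(1, 7)
    for q in range(1, 7)
)
```

The product equals $(N + q)/(N + q - j + 1)$, so each term is $(N + q)\big[\tfrac{1}{N + q - j} - \tfrac{1}{N + q - j + 1}\big]$ and the sum telescopes to $(N + q)\big[\tfrac{1}{N} - \tfrac{1}{N + q}\big] = q/N$. Multiplying by $N_{1*}^t/n$ then gives a single term of the form appearing in (3.7).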

Now assume that (3.7) holds for $m - 1$; we will show that (3.7) holds for $m$. By
assumption, there are $m$ distinct $(l_j, r_j)$'s satisfying $l_1 < \cdots < l_m < t < r_m < \cdots < r_1$.

WLOG,  let  =  •••  =  •••)  {^qi-\ — i-gm-i+i’ — i-9m-i+ii^ 

•  •  •  =  {L*q^+...+q^ ,  Rl,+...+q^ }  =  {^m,  Tm},  where  Qi  +  ■  ■  •  +  Qm  <  n.  Then 

51 H - h5m  1  1  AT* 

j=l  l<k<j  ''  ^ 

Tfi — 1  I  ^ 

= i:  {i|ii  n  ^  (by  induction  assumption  on  m  —  1) 

j=i  ^  ^ 

qi-\ - \-qm  .  -  T\Tt 


j=Ql~\ - h^m-l  +  l 


WLOG,  we  can  assume  that  iV|*  >  0  (and  thus  pj*{t)  =  1),  then  the  last  summation 
equals 


51 H - h5m  1  AT? 

i=9l+*“+<?m-l+l  l<k<j 

qi-\ - hQm  I  -I  , 

'  ,.ilj  'J 

J=51+— +9m-l+l  l<fc<m  l<s<5fc 

AT  I  ..  - t5m  1 

■[n(^)],  E  I 

l<fc<m  - h^m-l  +  l  ^1-1 \-Qm 


n 

51+-+Q'  m  —  1  +i<fc<j 


=[  n  ( 


Nk*  +  Qk 


Qi-i - \-Qm  ^  _  -1  Art 

__  S  n 

j — — 1  +  1  ^i-j - 1-^  m~  1 


1  -.^1 


(6.13) 


=[  n  (^)]EKn 


l<fc<m 


j=l  l</c<j 


^  d"  5m  J 


^ +  5fe  N-|r5m  ir  -^m 


=[  n  (  (see  the  third  equality  in  (6.11)). 

Then  (6.12)  and  (6.13)  yield  (3.7)  for  m.  This  completes  the  proof  of  Lemma  6.1.  □ 
Lemma  6.2.  Expression  (3.16)  in  Theorem  3.1  equals  0. 

Proof: Denote by $\sum_k s_k$ expression (3.16). To prove the lemma, it suffices to

show that each summand, $s_k$, in summation (3.16) vanishes.

It  is  trivial  when  A:  =  1.  When  k>2,  denoting  Y)]=i  aj  =  0  Hi  <  1, 

s.  T/''"?!  ^  ^  (!t,i]}P{(i,iJ)  =  (h.Tk)} 
S-


^  1  -  ELiP{(i-R)  =  ('i.'-f)} 

p{x  6^<t,<l}P{(Efi)  =  ('*’*•*)>  -  P{X  e  (!t,(]}P{(L,JJ)  =  (h,n)} 

r  P|(£,fi)  =  (4-i,rfc-i)}  P{X€fo,i]}P{(L,-R)  =  (it,rt)}i 

^i-Eft<.-iP{(i.'R)  =  fe.'''.»  i-E‘riP{(i.«)  =  ('*.’•<)}  ^ 


Pf(L,fl)  =  (/i,r,)}  P{.y€fa,t]}P((L,i;)=fa,rfc)}l 

^li-E„<jP{(i.i?)  =  i-ELiP{(i.^2)  =  ft.>'i)}  ‘ 

f  i-Efa‘P{(^.«)  =  ('<■’•■)}  P{^efei]}P{(EJ?)  =  (it.>-t)}l 
ii-EK.-iP{(i.'fi)  =  (''>.'■»)}  i-Eti‘P{(i.«)  =  ('i.’'i)}  ‘ 


PUL,R)  =  P{A'€fe,tl}P{(X,-R)  =  (4,rt)}| 

,tl^i-E:ft<jP{(i.«)  =  &.’■».)}  i-EiiP{(i.-R)  =  ('•.’■<)}  > 

P{X  6  fe,t]}PjjL,fi)  =  e  fe,(]}P{(L,iJ)  =  {k,r,)} 

i-Ei.</i-iP{(i.^)  =  (*<»’■'>)) 


P{(L,j;)  =  fe,rj)}  P{X€(4,t]}P((i^,-R)  =  (4,'-|t)}l 

^  11 -Eft<jP{(i.  i-ELiP{(i.  ■«)  =  (*;.'■<)}  J 


It  is  important  to  note  that  the  last  expression  has  the  same  pattern  as  the  first  expression, 
except  that 

A;— 1  ^ 

and  in  the  first  one  are  replaced  by  ^  and  in  the  last  one, 

j=l  h<k  h<k-l 


respectively.  Thus  inductively,  we  can  show  that  Ylh<k  expression 

can  be  replaced  by  X)j=i  Ylh<v 


Sk 


-P{Xe(4,t]}P{(i,B)  =  (is,rii)} 


P(Xgfa,t]}P{(L,fi)  =  (l»,.rt)} 
l-E/,<iP{(i.-R)  = 

P{(L,-R)  =  (i,-,r,)}  P{X  €  (it,t|}P{(L,.R)  =  (it,n,)}l 
-  Ek<,P{(i.lJ)  =  (!/..'■».)}  l-EiiP{(i.lJ)=('i.'-i)}  > 
=P{X  e  (4,il}P{(i,.R)  =  (()..>•*)}  -P{X  €  (h,t]}P{{L,R)  =  (4,1-*)} 

=0.  □ 


-S{rr 


Lemma 6.3. Statement (S3) in the proof of Theorem 4.1 holds.
Proof: Denote $A_m^c = B \setminus A_m$. We will show that

(S4) $A_m$ is a closed set.
Thus, $A_m^c$ is an open set, which equals $\cup_{i\ge 1}(a_i, b_i)$, where the $(a_i, b_i)$'s are disjoint open

intervals and $a_i, b_i \in A_m \cup B^c \cup \{\pm\infty\}$. Then we will show that

(S5) $P\{X \in (a_i, b_i)\} < 1/\sqrt{m}$ for any $i$.

By the condition in the lemma, (4.2) and (4.3) hold for $t \in A_m$; moreover, (4.2)
and (4.3) hold for $t \in B^c$ by the arguments after inequality (4.3); furthermore, (4.2)
and (4.3) hold if $t = \pm\infty$. The reason for the last statement is as follows. If $t = \pm\infty$,
$P\{L < t < R\} = 0$. Thus $N_k^t = 0$ and $Q_n(t) = 0$ (see (2.2) and (3.3)). It follows that
$P\{L < X \le t < R\} = 0 = Q_n(t)$, which implies (4.2) and (4.3). As a consequence, if $t = a_i$
or $b_i$, (4.2) and (4.3) hold and

$|S_I(t) - S(t)| \le O(1/\sqrt{m})$ (see (3.3), (3.4) and (3.5)).

Since both $S_I(t)$ and $S(t)$ are nonincreasing functions of $t$ and in view of (S5), it is easy
to show that $|S_I(t) - S(t)| \le O(1/\sqrt{m})$ for $t \in (a_i, b_i)$, which is equivalent to inequalities
(4.2) and (4.3) for $t \in (a_i, b_i)$. Thus the proof of the lemma will be completed after we
show (S4) and (S5).

To  show  (S4),  let  {tk}  be  a  sequence  of  points  in  Am  which  satisfies  limfe_).oo  tk  =  to- 
We  need  to  show  that  to  G  -^m-  By  the  notation  in  the  proof  of  Theorem  4.1  (see  the 
definition  of  Am),  for  each  tfe,  P{A’  G  [Imk^'^mk]}  ^  where 
Imk  =  sup{/;  I  <tk  <r,  {l,r)  eV,P{l  <  X  <  to}  >  0}, 

Tmk  =  inf{r;  I  <tk  <r,  (/,  r)  G  V,  P{i  <  X  <  to}  >  0}. 

It  can  be  seen  that  the  sequence  of  intervals  {Imk  >  ’"mj  also  satisfy  Condition  DI.  Since  the 
probability  is  bounded  by  1,  there  are  at  most  finitely  many  disjoint  {lmk,fmk)-  WLOG,  we 
can  assume  that  [tmjbjrTOfc]  D  [^mfc+i>’"rnfc+i]i  ^  >  1.  It  follows  that  PIX  G  [^mo^rmo]}  > 
where  [lmo,rmo]  =  (^kilrnky'^mk]  and  it  can  be  verified  that 
Imo  =  sup{/;  I  <  to  <r,  {I,  r)  G  V,  P{/  <  A"  <  to}  >  0}, 

Tmo  =  mf{r;  I  <  to  <r,  (Z,  r)  G  V,  P{Z  <  X  <  to}  >  0}. 

As  a  consequence,  to  G  Am-  Thus,  Am  is  closed  i.e.,  (S4)  holds. 

We now show (S5). Given $(a_i, b_i)$, it follows from the definition of $A_m$ that for each
$t \in (a_i, b_i)$, there is an interval $(l_t, r_t) \in V$ such that $P\{X \in (l_t, r_t)\} < 4/\sqrt{m}$ and $l_t < t < r_t$.
The collection $\{(l_t, r_t);\ t \in (a_i, b_i)\}$ is a cover of $(a_i, b_i)$, i.e., $\bigcup_t (l_t, r_t) \supset (a_i, b_i)$. It follows
that either (1) there is an $(l_t, r_t) \supset (a_i, b_i)$ or (2) there are three points $t_1, t_2, t_3 \in (a_i, b_i)$
and two intervals $(l_{t_1}, r_{t_1})$ and $(l_{t_2}, r_{t_2})$ in the collection $\{(l_t, r_t);\ t \in (a_i, b_i)\}$ such that
$t_3 \in (l_{t_1}, r_{t_1}) \cap (l_{t_2}, r_{t_2})$, $t_1 \notin (l_{t_2}, r_{t_2})$, $t_2 \notin (l_{t_1}, r_{t_1})$ and $t_1 < t_3 < t_2$. However, case
(2) is impossible since $(l_{t_1}, r_{t_1})$ and $(l_{t_2}, r_{t_2})$ violate Condition DI but they belong to $V$,
whose elements satisfy Condition DI. It follows that there exists at least one $(l_t, r_t)$ in this
collection such that $(l_t, r_t) \supset (a_i, b_i)$. Then $P\{X \in (a_i, b_i)\} \le P\{X \in (l_t, r_t)\} < 4/\sqrt{m}$.

Thus (S5) holds. □

Lemma 6.4 Using the notation in the proof of Theorem 4.1, (4.6) and (4.7) hold.

Proof: The proof of the lemma is analogous to the one for Lemma 6.1. In the following,
in order to avoid exponents $\beta_{j*}(t)$ etc. (see (4.6)) we first assume that $N_{m_t *} \ne 0$. Thus
$N_{j*} \ne 0$ for all $j \le m_t$.

We first prove (4.6) and (4.7) for $m_t = 1$.

Let  Yf  ,...,Yf  be  the  Y^*’s  which  belong  to  D^,  then 

J1  J 

\[ N_{1*-} + q_1 - k \le N_{j_k} \le N_{1*-} + q_1 - k + A_1, \tag{6.14} \]
*  I  ^  ^


N‘  +  Ai 


^jk  ^1*-  +  9i  —  k  +  Ai 


n  (1+ 


l<j<k 


iVi*_  +  9i  “  i  +  ^1  ^1*-  +  ?i  ^  +  Ai 
nr  /.  1  \8^(t)^'iu  /  .  /w  .  1  \J0 


2  n  +  (smce(l  +  ^)‘’'<‘>=lifft(0  =  0) 


n  (i+w, 


w;._  +  Ai 


l<j<k 


iVi*- +  9i  —j  -^1*- +  9i  —  A:  +  Ai 


(6.15) 


for  fc  =  1, ...,  qi.  Let  pi  be  such  that  <li<  Summing  up  each  term  in  (6.15) 

over  A:  =  1,  yields 


E  n  (*  + 


fc=i i<i<fe 


Ni^-  +  qi  —  j  +  Ai  iVi*_  +  qi  —  k  +  Ai 


/'~P'  n  (i+±)«‘'AS_ 
n  ('+w. 


Af!.-  +  Ai 


fc=l l<j<k 


AVi*_  +  ~~  j  —  A;  +  Ai 


(6.16) 


The  second  summation  in  (6.16)  is  due  to  Nl  =  I3k(t)  =  0  if  <  Pi  —  —  9?  and  if 


k  i  O'l, Jq-}-  That  is, 


Tt  Pi -9^-91 


fe=i  i<j<jk 


k=l  l<j<k 


Nj>  {Nk^'^W 


By  an  induction  argument  on  g®,  we  can  show  that 


^  TT  +  -yl- 

k>pi—q^—qi  k 


Simplifying  the  first  and  the  third  expression  in  (6.16)  yields 


■^1*-  ^  ^  1  \/3j(f)  -^fc  ^  AV'f*-+Ai  ('0 1  o'! 

— ncS  E  11  (I+wT)  lATAAW  W,.  +a,' 


iVi*_  +  Ai 


fc=l  l<J<ft 


AT/  (Arfc)/3fcW  -  iVi,_  +  Ai' 
Inequality (6.14) also yields


TT  (1  + - 1 - )<  n  + 

^  Nu-  +  qi-j  +  ^i  0 


(6.19) 


-  n  (1  + jvi*_+9r-P’ 


i<i<9i 


Simplifying  the  first  and  the  third  product  in  (6.19)  yields 

^Nu-  +  Qi  +  Ai^  ^  -pj  ^  ^  fNu-  +Qi  \ 

^  iVx*_  +  A— 11,  0^ 


(6.17)  and  (6.20)  yield 


piVi^.+gi+Ai  Nl^  ^  ^  TT  (  1  xft(t)  Nl 

iV,._  +  A, 

0-^1*- +9i  ^1* 


<9i 


Nu- 


(6.20) 


(6.21) 


(6.18)  and  (6.21)  yield 

_ o/iVi»-  +  gi  +Al^  iVi^, 

+  Ai  .^1*-  +  Ai 

<V  TT  (1  +  — _ — _ 

-2^  11  ^  iV/  (iVfc)/5^0) 

^  _  iVj*_  +  Ai  -^1*  ('fi  oo'v 

-  <‘r<nz;  ^ 


It can be seen that if $N_{m_t *} = 0$, (6.22) is still true. (6.22) is equivalent to (4.6) and (4.7)
when $m_t = 1$. (The above arguments are similar to the induction argument when $m = 1$
in the proof of Lemma 6.1.)

Mimicking the arguments from (6.14) through (6.22), we can show inductively that
(4.6) and (4.7) hold. □